A Case for Metaprogramming

PWQ Overview 20 Jan 2004

While Tcl does have some facilities for metaprogramming by virtue of the language, there are others that could be added.

Case 1

Let's examine the use of the $ substitution.

The feather extension proposes that the notation $text be replaced with the subcommand syntax [set text].

This is primarily to support the replacement of the set command and ensure that $ behaves the same.
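
The equivalence this relies on can be checked in a stock interpreter. A minimal sketch; the variable names are made up for illustration:

```tcl
# $name and [set name] read the same variable.
set text "hello"
set viaDollar $text
set viaSet    [set text]

# Unlike $, [set] also accepts a variable name computed at run time.
set which text
set viaComputed [set $which]

puts [list $viaDollar $viaSet $viaComputed]   ;# prints: hello hello hello
```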

Let's take that one step further:

Consider the following pseudo scalar references:

    ${[somefunc $x y $z]}

None of the above can be used as references to scalar values in the current implementation of the parser.

The parser

Let's for a moment assume that parts of the internal parser are exposed to the script level. We could then have the opportunity to plug in our own references. For example, it might look like this:

     parser -char $ -replace {$([^ ]+)} -with {[set \\1]}
     parser -char $ -replace {$([^(]+)\(([^,]+)\)\(([^,]+)\)\(([^)]+)\)} -with {[araygetthree \\1 \\2 \\3 \\4]}
     parser -char $ -replace {$\{\[([^]]+)\]\}} -with {[funcset \\1]}

Other Languages

The usefulness of metaprogramming to emulate other languages is not quantified. Personally I do not see the benefit of language emulation.

Consider emulating Forth; here is an example:

        2 dup + s" The answer is " s. .

This cannot be parsed by the Tcl parser directly. The form:

    eval [join {2 dup + s" The answer is " s. .} \n]

would work for every token except s", as this is an immediate command that scans forward through the text to the next " character. Thus the Forth parser only sees s" s. . in the input.

Using the parser we may be able to install something like:

   parser -token s" -extend {[^"]*"}
   parser -eol { }
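
For comparison, the forward scan that s" needs can be written by hand in plain Tcl, outside the Tcl parser. This is only a sketch: forthEval is a made-up name, it knows just the five words used in the example, and it assumes Tcl 8.5 or later for lassign.

```tcl
# Hypothetical mini-interpreter for the example line only.
proc forthEval {src} {
    set stack {}
    set out ""
    set pos 0
    # Take blank-delimited tokens one at a time, starting at $pos.
    while {[regexp -start $pos -indices {\S+} $src span]} {
        lassign $span first last
        set tok [string range $src $first $last]
        set pos [expr {$last + 1}]
        switch -exact -- $tok {
            dup { lappend stack [lindex $stack end] }
            +   {
                set sum [expr {[lindex $stack end-1] + [lindex $stack end]}]
                set stack [lreplace $stack end-1 end $sum]
            }
            {s"} {
                # Immediate word: consume raw text up to the next quote.
                set start [expr {$pos + 1}]
                set close [string first \" $src $start]
                if {$close < 0} { error "unterminated string" }
                lappend stack [string range $src $start [expr {$close - 1}]]
                set pos [expr {$close + 1}]
            }
            s. - . {
                # Both print words pop the top of the stack into the output.
                append out [lindex $stack end]
                set stack [lrange $stack 0 end-1]
            }
            default {
                if {![string is integer -strict $tok]} {
                    error "unknown word: $tok"
                }
                lappend stack $tok
            }
        }
    }
    return $out
}

puts [forthEval {2 dup + s" The answer is " s. .}]   ;# prints: The answer is 4
```

The point of the sketch is that the scanner, not the Tcl parser, decides how far a token extends, which is what the -extend option above would have to express.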

The use of [subcommand] notation would also need a mechanism by which it can be overridden. This would complicate the expansion of $ variables unless there were a -command option to the parser that forces evaluation of the replacement text regardless of the meaning of [.

Regular expressions

The use of regexps in the example is for familiarity only. Regexps are useless for processing recursive or paired definitions (brackets within brackets and the like).

A more useful format would be SNOBOL string matching, or Objective-C selector/method encoding.


Given that the TCT are never going to make arrays first class variables, the ability to create new $ variable encodings would go a long way to address programmers' desires to emulate them. It also allows the encoding of structures more efficiently as a call to [func struct member] can be more efficient than having to parse the array reference every access (example [$dict member] vs $array($y,member)).
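
The two access forms being compared can be put side by side. A minimal sketch using the dict command available from Tcl 8.5; the names arr and d are placeholders, and no timing is measured or implied:

```tcl
# Array with a composite key: the element name "$y,member" is
# assembled by substitution each time the reference is evaluated.
set y 3
set arr($y,member) "value"
set fromArray $arr($y,member)

# A dict is an ordinary value, so it can be nested, passed to procs
# and returned from them.
set d [dict create $y [dict create member "value"]]
set fromDict [dict get $d $y member]

puts [list $fromArray $fromDict]   ;# prints: value value
```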

The ability to change the meaning of $ notation would be a facile change to the core. Other substitutions would require extensive changes to the parser and it is unlikely that this would be undertaken.

RS 2004-01-26: I'm not sure who wrote this page, but here's my comments:

  • Tcl started out with only [set x] for variable dereferencing. The $x shortcut, as known from shells, was added later for convenience, and it sure is: 1 keystroke in place of 6... Just like 'X for (QUOTE X) in Lisp.
  • Your leading example can be parsed well in Tcl if you write it as
  $Array($x,y,$z) ;# or any other delimiter that is not part of subkeys

PWQ In this example the reference would be $Array($x)->y->$z, which cannot be represented as above because you cannot store an array in an array subscript.

  • The second case needs an explicit set to work:
   [set [somefunc $x y $z]]
  • The Forth example would not run when evalling \n-delimited words, because constant values are indistinguishable from commands. But if you drop the s" ... " construct with lookahead, and just allow any string as word, minimal RPN with some more work can well handle
   2 dup + "The answer is " . .
  • Tcl arrays are collections of variables, somehow like namespaces, but again with a simpler syntax: $x($y) / $x::y. For "first-class value" arrays, dicts will be ready from 8.5, and will take over much of what is being done with arrays today. Also, lists are a good representation for many uses where other languages have (their) "arrays".

PWQ Again, dicts have their place, and if one could use the notation of $dict(key) instead of [$dict get key] then I would agree. Another case for including meta programming.

  • Being open source, everybody is free to take Tcl and modify it to his delight. But the TCT has to see that changes to the Tcl language do not make existing scripts break, and disappoint a considerable user-base worldwide. PWQ nothing in the MP examples I gave requires breaking compatibility
  • The language of regexps is not always a joy to behold, but it's concise, powerful, and with inline comments it can be made more readable. Plus, it's sort of standard - no big context switches are needed for grep, sed, awk users. Snobol and Objective C are interesting in the history of computing languages, but not really relevant today PWQ the same argument could be made about Tcl. The Tcl heritage with elements from Lisp, C and shell makes Tcl more accessible to persons with experience in those languages.

PWQ I don't see the inclusion of MP as being bad. Some people just want to have more parts of the internals exposed to the script level. - RS agrees with both points - and thinks, Tcl as it is is quite fantastic already for metaprogramming and functional programming.

Lars H 2004-01-26: "Exposing" the parser as suggested, which in reality would mean making it configurable, could also slow it down quite considerably. As for "nothing [of the above] requires breaking compatibility", this seems to overlook the breaking of compatibility that arises when some construction, such as for example

  lappend $Array($x)($y) 0

where the first parenthesis is meant to be substituted by the parser but the second is not, suddenly gets a new interpretation. But perhaps the intent was only that one should be able to get this incompatible behaviour by reconfiguring the interpreter using script level commands, not that this should be the default behaviour.
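
What stock Tcl does with such a word today can be checked directly. A small sketch, with made-up variable values:

```tcl
# Only the first parenthesis is an array index; "($y)" is ordinary
# text in which $y is substituted.
set Array(a) table
set x a
set y k
lappend $Array($x)($y) 0

# The word expanded to "table(k)", so lappend created that element.
puts [array get table]   ;# prints: k 0
```

Any script that relies on this expansion would change meaning if the second parenthesis became a further level of indexing.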

There is however a way to achieve the above goals without changing the Tcl parser one bit, through a devious use of the little language concept: "Syntax extended Tcl" can just be another little language! Since a Tcl script is just a string, it is quite possible to parse it and convert all extended syntax features used to plain Tcl, before the script is fed to anything like proc or eval. (Such a custom parser could for example be constructed from the parsetcl one.) Then one would write code like

 xproc Axy {x y} {
     global Array
     return $Array($x)($y)
 }

and have it automatically rewritten as

 proc Axy {x y} {
     global Array
     return [dict get $Array($x) $y]
 }

(or whatever the double indexing is supposed to mean) by the xproc procedure. Since each procedure body only has to be rewritten once, it probably won't matter if the rewriting code is slow due to being written in Tcl, but in case it does matter then it is always possible to write the parser in C instead.
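
A rough idea of what such an xproc could look like follows. It is only a sketch: it uses regsub rather than a real Tcl parser such as parsetcl, rewriteDoubleIndex is a made-up helper name, and it assumes the dict get interpretation of double indexing and indices without nested parentheses.

```tcl
# Hypothetical helper: turn $name(idx1)(idx2) into
# [dict get $name(idx1) idx2]. Being regexp-based, it cannot handle
# indices that themselves contain parentheses.
proc rewriteDoubleIndex {body} {
    set re {\$(\w+)\(([^()]*)\)\(([^()]*)\)}
    return [regsub -all $re $body {[dict get $\1(\2) \3]}]
}

# Rewrite the body once, then define an ordinary proc in the caller's
# context.
proc xproc {name params body} {
    uplevel 1 [list proc $name $params [rewriteDoubleIndex $body]]
}

xproc Axy {x y} {
    global Array
    return $Array($x)($y)
}

set Array(a) [dict create b 42]
puts [Axy a b]   ;# prints: 42
```

Because the rewriting is purely textual, it would also rewrite a word such as the lappend example above, so a production version would need to parse the body properly.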

xk2600 Just a thought... exposing the parser among other internals seems like a reasonable enough request, but to prevent compatibility issues, and possibly allow for the ability to develop test environments for such concepts, would an extension to interp not be a good place to implement such a feature? Then just like a safe interpreter you could have a modified interpreter running as a slave. In theory, the parser could then preprocess the script prior to executing in the slave. At the end of the day, the real performance cost is incurred when the code must be parsed, but once it's been converted to bytecode one would think it doesn't matter at all what language the user writes the code in.
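
Something close to this can already be approximated at script level with the existing interp command. The sketch below is an assumption-laden illustration: translate and xeval are made-up names, and the ":=" assignment syntax is invented here purely to have something to translate.

```tcl
# Hypothetical translator: rewrite lines of the form "name := value"
# into "set name value"; everything else passes through unchanged.
proc translate {script} {
    set re {^[ \t]*(\w+)[ \t]*:=[ \t]*(.+)$}
    return [regsub -all -line $re $script {set \1 \2}]
}

# Preprocess in the master, then evaluate in the slave interpreter.
proc xeval {child script} {
    return [interp eval $child [translate $script]]
}

set child [interp create]
set result [xeval $child {
    a := 2
    b := 40
    expr {$a + $b}
}]
puts $result   ;# prints: 42
interp delete $child
```

The master interpreter keeps standard syntax throughout; only scripts passed through xeval see the extended form.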

I'm envisioning the ability to use any linguistic syntax one could imagine, with its own translator into a Tcl bytecode representation for a fast execution path, as well as the ability to take the slow path through the Tcl parser and interpreter when it's not possible for the parser to convert to bytecode, for example in the case of dynamic procedures.