Version 9 of Sugar command macros

Updated 2004-03-24 22:24:07

* Section 0 - Sugar

What is a Tcl macro.

A macro is an operator implemented by transformation. Macros are procedures that generate a Tcl program at compile time and substitute it where the programmer used their name as a command.

It's a simple concept when explained with examples.

Suppose you want to write a [clear] command that sets the variable whose name is passed as its only argument to the empty string. Implemented as a procedure using upvar, it looks like this:

 proc clear varName {
     upvar 1 $varName var
     set var {}
 }

Every time the user types

  clear myvar

in a program, the [clear] procedure will be called, and the variable myvar of the calling procedure will be set to the empty string.

As an alternative to calling a procedure that is able to alter the caller's execution environment, we may want to automatically substitute every occurrence of the command [clear <varname>] with [set <varname> {}].

So basically we want that when we write

  clear myvar

in a program, it is substituted with

  set myvar {}

as if the programmer had really typed "set myvar {}" instead of "clear myvar". That's the goal of the simplest form of Sugar macro.

The definition of a new macro is very similar to the creation of a procedure. The following is the implementation of [clear] as a macro:

 sugar::macro clear argv {
    list set [lindex $argv 1] {{}}
 }

It means: "If you encounter a command called 'clear' inside the source code, call the following procedure putting all the parts of which the command is composed in $argv, and substitute the occurrence of the clear command and arguments, with what the procedure will return."

Again, in other words:

When a procedure is compiled, for every occurrence of the [clear] command inside the procedure, the above macro procedure is called with $argv set to a list that represents the arguments used to call the macro (including the macro name itself as the first element). The return value of the macro, which should be a list of the same form, is substituted in place of the original macro call.

To make the example more concrete, see the following code:

 proc foobar {} {
     set x 10
     clear x
 }

Before compiling the procedure, Tcl will call the macro we defined with $argv set to the two-element list {clear x} (verbatim). That procedure returns the result of

  list set [lindex $argv 1] {{}}

so for the argument {clear x} it will return the list {set x {}}. This return value will be substituted in place of "clear x".
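To see the token list concretely without Sugar loaded, the macro body can be run as an ordinary procedure. This is only a sketch: clear_macro is a hypothetical stand-in name, and joining the tokens with spaces is an assumption about how the expander rebuilds the command.

```tcl
# Hypothetical stand-in: the body of the [clear] macro as a plain proc.
proc clear_macro argv {
    list set [lindex $argv 1] {{}}
}

# The macro receives the command as a list of tokens, macro name included.
set tokens [clear_macro {clear x}]

# Assumption: the expander joins the returned tokens with spaces.
set expanded [join $tokens " "]
puts $expanded
```

The third token is the two-character string {}, so the joined text reads set x {}.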

Actually, after the proc is defined, we can use [info body] to check what happened with the macro:

  info body foobar

will output

     set x 10
     set x {}

At this point it's possible to use the [clear] macro as if it were a Tcl procedure.

But Tcl has [uplevel] and [upvar], so what are macros useful for? Fortunately they allow many interesting things that are not possible at all otherwise. Still, this example shows the first big advantage of macros:

1) Macros make Tcl faster, without forcing the user to inline code by hand.

The [clear] command implemented as a macro runs 3 times faster in my Tcl 8.4.
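A rough way to check a figure like this on your own interpreter is to time a procedure that calls the upvar-based [clear] against one where the equivalent [set] is written inline, which is what the macro expansion amounts to. This is a sketch only: with_proc and inlined are hypothetical names, and the ratio you get depends on the Tcl version and machine.

```tcl
# The upvar-based implementation.
proc clear varName {
    upvar 1 $varName var
    set var {}
}

# Calls [clear] as a real procedure.
proc with_proc {} {
    set x 10
    clear x
    return $x
}

# What the body looks like after the macro has been expanded.
proc inlined {} {
    set x 10
    set x {}
    return $x
}

puts "proc call: [time {with_proc} 10000]"
puts "inlined:   [time {inlined} 10000]"
```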

Also, since [upvar] is one of the biggest obstacles to the Tcl compiler's ability to optimize bytecode, it's not impossible that at some point Tcl will be able to run much faster if the user can ensure that a given procedure is never the target of [upvar].

Simple commands that involve the use of upvar can be even simpler to write when implemented as macros. The following are four examples:

 # [first $list] - expands to [lindex $list 0]
 sugar::macro first argv {
    list lindex [lindex $argv 1] 0
 }

 # [rest $list] - expands to [lrange $list 1 end]
 sugar::macro rest argv {
    list lrange [lindex $argv 1] 1 end
 }

 # [last $list] - expands to [lindex $list end]
 sugar::macro last argv {
    list lindex [lindex $argv 1] end
 }

 # [drop $list] - expands to [lrange $list 0 end-1]
 sugar::macro drop argv {
    list lrange [lindex $argv 1] 0 end-1
 }
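For reference, these are the plain Tcl commands the four macros expand to, applied to a small list. The variable names are only illustrative.

```tcl
set items {a b c d}

set firstItem [lindex $items 0]        ;# what [first $items] expands to
set restItems [lrange $items 1 end]    ;# what [rest $items] expands to
set lastItem  [lindex $items end]      ;# what [last $items] expands to
set dropItems [lrange $items 0 end-1]  ;# what [drop $items] expands to

puts "$firstItem | $restItems | $lastItem | $dropItems"
```

This prints a | b c d | d | a b c.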

Sugar supports three types of macros. We are dealing with the simplest and most common type: command macros.

The other two types, syntax macros and transformers, will be covered later. For now let's create a more complex macro.

A more complex example

Good macros do source code transformation in a smart way: they turn a form that is understood by the programmer into one that is also understood by the compiler, one that is hard to type and use in raw form without macro support, but optimal otherwise.

Ideally a macro should expand to a single command call (possibly containing many nested ones), and whenever it can be avoided it should not expand to code that magically creates variables at runtime to store intermediate results, because there may be collisions with variables in the function or with variables created by other bad macros. (Btw, the TODO list of Sugar includes a way to generate unique local variable names.)

If the macro is well written, the programmer can use it like any other command without caring much.

We will now see a more realistic example: a macro that implements a very efficient [lpop] operator. It accepts only one argument, the name of a variable, and returns the last element of the list stored in that variable. As a side effect, [lpop] removes the last element from the list (it's something like the complement of lappend).

A pure-Tcl implementation is the following:

 proc lpop listVar {
    upvar 1 $listVar list
    set res [lindex $list end]
    set list [lrange $list 0 end-1]
    return $res
 }

This version of lpop is really too slow: when [lrange] is called, it creates a new list object even though the original one, stored in the $list variable, is going to be freed and replaced by the copy. Modifying the list in place is far better.

The [lrange] implementation is able to perform this optimization if the object is "not shared" (if you don't know about this stuff, try reading the Wiki page about the K operator before continuing).

So it's better to write the proc using the K operator. The lrange line should be changed to this:

    set list [lrange [K $list [set list ""]] 0 end-1]

With K being:

 proc K {x y} {
     return $x
 }

But even calling K is costly in terms of performance, so why not inline it too? Doing so requires changing the previous lrange line to this:

    set list [lrange [lindex [list $list [set list ""]] 0] 0 end-1]

That's really a mess to read, but it runs at a different speed and, even more important, with a different time complexity!
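Put together, the pure-Tcl procedure with K inlined looks like the sketch below. It uses the range 0 end-1 so that it is the last element that gets removed, matching what [lpop] is meant to do.

```tcl
proc lpop listVar {
    upvar 1 $listVar list
    set res [lindex $list end]
    # Inline K: take the list value out of the variable and empty the
    # variable in the same command, so [lrange] sees an unshared object.
    set list [lrange [lindex [list $list [set list ""]] 0] 0 end-1]
    return $res
}

set stack {1 2 3}
set top [lpop stack]
puts "popped $top, remaining: $stack"
```

This prints popped 3, remaining: 1 2.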

By writing a macro for [lpop] we can go even faster and, at the same time, have code that is easier to maintain and read. Macros are allowed to expand to commands containing other macros, and so on. This means that we can write a macro for every single step of [lpop]. We need the [first], [last] and [drop] macros already developed, and a macro for K:

 sugar::macro K argv {
    # argv holds the macro name followed by its two arguments
    foreach {_ x y} $argv break
    list first "\[list $x $y\]"
 }

Note that we used foreach instead of two calls to lindex, which are probably faster. But remember that macros don't have to be fast in the *generation* of the expanded code.

This will expand [K $x $y] into [first [list $x $y]], which will in turn be expanded into [lindex [list $x $y] 0].

We have one last problem. Even after the optimization and the inline use of [K], the procedure above requires a local variable 'res' to save the last element of the list before modifying it, and uses $res later as the return value of the procedure. We don't want to create local vars in the code that calls the [lpop] macro, nor do we want to expand to more than a single command. The K operator can help us here. Instead of using:

    set res [lindex $list end]
    set list [lrange [lindex [list $list [set list ""]] 0] 0 end-1]
    return $res

why not just write:

    K [lindex $list end] [set list [lrange [lindex [list $list [set list ""]] 0] 0 end-1]]

That's ok, but what unreadable code! Thanks to macros we can abstract away the fact that calling procedures is slow, so we just write:

    K [last $list] [set list [drop [K $list [set list ""]]]]

It will not win the clean-code contest this year, but it's much better than the previous one. Ok... now we want a macro that, every time we type "lpop list", will expand to the above line:

 sugar::macro lpop argv {
    set varname [lindex $argv 1]
    set argv [list \
        K \
        {[last $%varname%]} \
        {[set %varname% [drop [K $%varname% [set %varname% ""]]]]} \
    ]
    foreach i {1 2} {
        lset argv $i [string map [list %varname% $varname] [lindex $argv $i]]
    }
    return $argv
 }

There are a few things to note about this code. The macro returns a list in which every element is a token of a Tcl command in the source code. This does not mean we have to convert into lists even the arguments that happen to represent a script. Also note that the input list of the macro is just a list of tokens that are *exactly* what the user typed in the source code, verbatim. It follows that the tokens are already quoted, valid representations of a procedure argument. We don't need to worry about whether they will be interpreted as a single argument, as we would if we were generating code to pass to eval.

This allows the macro developer to use templates for macros. In fact, the [lpop] macro just uses a three-token template, and the final foreach substitutes the variable name into the tokens that need to refer to it. You don't have to care what that variable name is. It can be a complex string formed by more commands, vars, and so on, likethis$and-this. If it was a single argument in the source code, it will still be one in the macro after the expansion.
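The template substitution can be tried outside Sugar by running the same steps in an ordinary procedure. This is a sketch under assumptions: lpop_template is a hypothetical name, the template writes to %varname% in every position, and the tokens are joined with spaces only to display the result.

```tcl
# Hypothetical stand-in for the [lpop] macro body, as a plain proc.
proc lpop_template argv {
    set varname [lindex $argv 1]
    set tokens [list \
        K \
        {[last $%varname%]} \
        {[set %varname% [drop [K $%varname% [set %varname% ""]]]]} \
    ]
    # Fill the variable name into the two tokens that refer to it.
    foreach i {1 2} {
        lset tokens $i [string map [list %varname% $varname] [lindex $tokens $i]]
    }
    return $tokens
}

set tokens [lpop_template {lpop stack}]
puts [join $tokens " "]
```

For the input {lpop stack} this prints K [last $stack] [set stack [drop [K $stack [set stack ""]]]].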

Another interesting thing to note is that we don't really have to return every token as a different element of the list. In practice we can even return a single-element list. The rule is that the macro expander takes care of putting an argument separator, like a tab or a space, between the elements of the list, and a command separator, like a newline or ";", at the end. If we put the spaces in ourselves, we can just return a single-element list.

So, the lpop macro can also be written in this way:

 sugar::macro lpop argv {
    set varname [lindex $argv 1]
    set cmd [format {
        K [last $%s] [set %s [drop [K $%s [set %s ""]]]]
    } $varname $varname $varname $varname]
    return [list $cmd]
 }

This is much simpler and cleaner, and it's perfectly possible to use this style. The difference is that returning every token as a separate list element lets Sugar leave the indentation of the original code unaltered. This helps both to keep the line numbers in procedure errors correct and to get good-looking output from info body. But since most macros are about commands that are typed on a single line together with all their arguments, for many macros it is just a matter of taste.

If you are implementing control structures that are:

 indented {in} {
    this way
 }

It's another question, and it's better to return every token as a list element.

Number of arguments and other static checks in macros

Macros expand to code that in most cases will raise an error if the number of arguments is wrong, but it's possible to add this check inside the macro itself. This is a big advantage of macros, because they are able to signal a bad number of arguments at compile time: this can help to write applications that are more reliable. It's even possible to write a macro that expands to exactly what the user typed in but, as a side effect, does a static check for a bad number (or format) of arguments:

 sugar::macro set argv {
    if {[llength $argv] != 3 && [llength $argv] != 2} {
        error "Bad number of arguments for set"
    }
    return $argv
 }

This macro returns $argv itself, so it's an identity transformation, but it will raise errors for [set] with a bad number of arguments even in code that will never be reached in the application. Note that this macro for set is a bit incomplete: to get it right we should add checks for arguments that start with {expand}. For this reason Sugar will provide a function to automatically search for a bad number of arguments in some future version.
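The check itself is easy to exercise as an ordinary procedure. In this sketch check_set_args is a hypothetical name; the condition accepts a token list of length 2 or 3 (the command name plus one or two arguments) and rejects anything else.

```tcl
# Hypothetical stand-in for the body of the identity macro for [set].
proc check_set_args argv {
    if {[llength $argv] != 3 && [llength $argv] != 2} {
        error "Bad number of arguments for set"
    }
    return $argv
}

puts [check_set_args {set x}]       ;# read form, passes through unchanged
puts [check_set_args {set x 10}]    ;# write form, passes through unchanged

# One argument too many is rejected before the code ever runs.
set failed [catch {check_set_args {set x 10 20}} msg]
puts "failed=$failed msg=$msg"
```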

Note that {expand} introduces for the first time the possibility for a command to get a number of arguments that is not evident from reading the source code but is computed at runtime. Actually {expand} is an advantage for static checks, because before it the way to go was [eval], which totally "hides" the called command, postponing all the work to run time. With {expand} it's always possible to say from the source code that a command is called with *at least* N arguments. Still, adding new syntax to Tcl will probably not play well with macros and other forms of source code processing.

Identity macros are very powerful for performing static syntax checks: they can warn not only about a bad number of arguments, but also about the type of these arguments. See for example the following identity macro for "string is":

 proc valid_string_class class {
    set classes {alnum alpha ascii control boolean digit double false graph integer lower print punct space true upper wordchar xdigit}
    set first [string index $class 0]
    if {$first eq {$}} {return 1}
    if {$first eq {[}} {return 1}
    if {[lsearch $classes $class] != -1} {return 1}
    return 0
 }

 sugar::macro string argv {
    if {[lindex $argv 1] eq {is} && [llength $argv] > 2} {
        if {![valid_string_class [lindex $argv 2]]} {
            puts stderr "Warning: invalid string class in procedure [sugar::currentProcName]"
        }
    }
    return $argv
 }

Thanks to this macro it's possible to ensure that errors like writing [string is number] instead of [string is integer] are discovered at compile time. In this respect macros can be seen as a programmable static syntax checker for Tcl. We will see how "syntax macros" are even more useful in this respect. This is the second feature that macros add to Tcl:

2) Macros are a powerful programmable static checker for Tcl scripts.

Actually I think it's worth using macros even just for this during the development process, and then throwing them away.

Conditional compilation

That's small and neat: we can write a simple macro that expands to some code only if a global variable is set to non-zero. Let's write this macro that we call [debug].

 sugar::macro debug argv {
    if {$::debug_mode} {
        list if 1 [lindex $argv 1]
    } else {
        list
    }
 }

Then you can use it in your application as if it were a conditional:

 # Your application ...
 debug {
    set c 0
 }
 while 1 {
    debug {
        incr c
        if {$c > 100} {
            error "Too many iterations..."
        }
    }
    .... do something ....
 }

All the [debug {something}] commands are compiled as [if 1 {something}] if the ::debug_mode variable is true. If instead this var is false, they will not be compiled at all.
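Both outcomes can be observed by running the macro body as a plain procedure with the flag switched on and off. This is a sketch: debug_macro is a hypothetical name, and the tokens are joined with spaces only to show the resulting source text.

```tcl
# Hypothetical stand-in for the body of the [debug] macro.
proc debug_macro argv {
    if {$::debug_mode} {
        list if 1 [lindex $argv 1]
    } else {
        list
    }
}

# Tokens as they would appear in the source: the braces belong to the token.
set call [list debug {{incr c}}]

set ::debug_mode 1
set enabled [join [debug_macro $call] " "]

set ::debug_mode 0
set disabled [join [debug_macro $call] " "]

puts "enabled:  '$enabled'"
puts "disabled: '$disabled'"
```

With the flag on the expansion is if 1 {incr c}; with the flag off it is the empty string, so nothing is left in the compiled body.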

That's the simplest example. You can write similar macros like [ifunix], [ifwindows], [ifmac], or even macros that expand to different procedure calls depending on whether a given command is called with 2, 3 or 4 arguments. The limit is the imagination.

New control structures

Not all programming languages allow writing new control structures. Tcl is one of the better languages that don't put the programmer inside a jail, but not all the programming languages that allow writing new control structures are able to make them efficient.

Tcl macros can make new control structures as fast as byte-compiled control structures, because user-defined ones are usually syntax glue for code transformations. Since macros are transformers that translate one form into another, they are a good fit for this.

Here is a macro for the ?: operator.

 # ?: expands
 #   ?: cond val1 val2
 # to
 #   if $cond {format val1} {format val2}
 sugar::macro ?: argv {
    if {[llength $argv] != 4} {
        error "Wrong number of arguments"
    }
    foreach {_ cond val1 val2} $argv break
    list if $cond [list [list format $val1]] [list [list format $val2]]
 }

The macro's comment shows the expansion performed. Since it is translated to an if command, it's as fast as a Tcl builtin.
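The expansion can be checked by running the macro body as a plain procedure and evaluating the text it produces. This is a sketch: ternary_macro is a hypothetical name, and joining the tokens with spaces stands in for what the expander is assumed to do.

```tcl
# Hypothetical stand-in for the body of the ?: macro.
proc ternary_macro argv {
    if {[llength $argv] != 4} {
        error "Wrong number of arguments"
    }
    foreach {_ cond val1 val2} $argv break
    list if $cond [list [list format $val1]] [list [list format $val2]]
}

# The condition token keeps its braces, exactly as typed in the source.
set tokens [ternary_macro [list ?: {{$a > $b}} yes no]]
set expanded [join $tokens " "]
puts $expanded

# Evaluate the generated command to confirm it behaves like ?:
set a 2
set b 1
set result [eval $expanded]
puts $result
```

The generated command is if {$a > $b} {format yes} {format no}, and with a=2 and b=1 it evaluates to yes.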

How macros knows what's a script?

In Tcl there are no types, nor special syntax for what is code and what is just a string, so you may wonder why macros are not expanded in the following code:

 puts {
    set foo {1 2 3}; [first $foo]
 }

But they are expanded in this:

 while 1 {
    set foo {1 2 3}; [first $foo]
 }

I guess this is one of the main problems developers face when designing a macro system for Tcl, and also one of the strongest arguments for the idea that a good macro system for Tcl is impossible, because you can't say what is code and what isn't.

Sugar was designed to address this problem in the simplest possible way: because it can't say whether an argument is a script or not, macro expansion is not performed in arguments. So in theory Sugar will expand neither the code that's the argument to puts nor the code that's the argument to while.

But of course, in the real world, for a macro system to be usable, macros should be expanded inside the while and not expanded inside puts. So the idea is that for commands whose argument you know to be a script, you write a macro that returns the same command but with the script arguments macro-expanded. It is very simple and in practice it works well. For example, this is the macro for while:

 sugar::macro while argv {
    lset argv 1 [sugar::expandExprToken [lindex $argv 1]]
    lset argv 2 [sugar::expandScriptToken [lindex $argv 2]]
 }

That's the macro for if:

 sugar::macro if argv {
    lappend newargv [lindex $argv 0]
    lappend newargv [sugar::expandExprToken [lindex $argv 1]]
    set argv [lrange $argv 2 end]
    foreach a $argv {
        switch -- $a {
            else - elseif {
                lappend newargv $a
            } 
            default {
                lappend newargv [sugar::expandScriptToken $a]
            }
        }
    }
    return $newargv
 }

As you can see, Sugar exports an API to perform expansion in Tcl scripts and expr expressions. There are similar macros for switch, for, and so on. If you write a new conditional or loop command with macros, you don't need this at all, because the macro will translate to code that contains some form of a well-known built-in conditional or loop command, and we already have macros for those (remember that macros can return code containing macros).

If you write any other command that accepts a Tcl script or expr expression as an argument, just write a little macro for it to do macro expansion. This has a nice side effect:

 proc nomacro script {
    uplevel 1 $script
 }

Don't write a macro for nomacro, and you have a ready-to-use command that works as a barrier for macro expansion.

Continue with section 2 - Sugar syntax macros


WHD: This is very cool, but I have to ask--why not allow macros to have a standard Tcl argument list? That is,

 sugar::macro mymacro {args} {...}

Gives the behavior you describe here, while

 sugar::macro mymacro {a b c} {...}

explicitly creates a macro that takes three arguments and will generate a standard error message if you supply some other number?


SS: This can be a good idea, since it's always possible to use 'args' as the only argument to get the current behaviour. I used a single list as input mainly because the same macro can have more than one name, and in order to have the same interface for both command macros and syntax macros. For example:

 sugar::macro {* + - /} argv {
    list expr [list [join [lrange $argv 1 end] " [lindex $argv 0] "]]
 }

Will handle * + - / with the same code. Macros with more than one name may in extreme cases even give different meanings to arguments in the same position. Btw there is 'args' for this case. So I can change the API to something like this:

  sugar::macro {name arg1 arg2 ...} {...}

That's like a Tcl proc, but with the name that was used to call the macro as the first argument. For syntax macros this format may not make a lot of sense, but there is still 'args'. I'll include this change in the next version if I don't receive feedback against it. Thanks for the feedback WHD.

WHD: I think that on the whole I prefer the previous syntax for command macros; the macro can always have an implicit argument that is the macro name. For example,

 # Identity macro
 sugar::macro {+ - * /} {args} { return "$macroname $args" }

SS: On a different question about the Sugar API, I wonder whether Tclers interested in this macro system prefer the current redefinition of proc, or whether it's better to provide a sugar::proc procedure that's exactly like proc but with macro expansion.

If the API remains as it is now, with proc redefined, I'll add to the wrapper an option -nomacro that will just call the original proc. Please add your name with an optional motivation below.

Yes, I think it's better to wrap the real proc:

  • Put your name here if you are for this solution.

No, I want macro expansion only using sugar::proc:

  • SS (avoids wasting CPU time on procs that don't use macros; this can make a big difference if you package require sugar before Tk or other big packages)

WHD: Since you have to override the standard control structures to make macros work, it seems to me that what you really need is a pair of commands:

 sugar::configure -enabled 1

 # Macros expanded in body
 proc myproc {a b c} {....}

 # Macros expanded in expression and body
 while {$a > $b} {....}

 sugar::configure -enabled 0

 # Macros no longer expanded.

SS: Actually Sugar does not override anything! (So it expands everything at compile time, with no run-time overhead.) It does expansion inside control structures just by using macros for while and so on. In this page this is better explained in the section: How macros knows what's a script?. So whether to override proc or to provide a private proc-like command is just a matter of design (or taste); everything will work in both cases.