Sugar command macros

*** NOTE ***
 The API has changed in two ways. Keep this in mind if you want to try the examples in this
 tutorial with the current release of Sugar.

 First change - Macros are now expanded only inside procedures defined with ::sugar::proc.
 Second change - Macros now receive a list of arguments like normal procedures, but the first
                 argument is the macro name itself. All the macros in this tutorial will work
                 if the 'argv' argument is replaced with 'args'.

    * Section 0 - '''[Sugar]'''
    * Section 1 - '''[Sugar command macros]''' (what you are reading)
    * Section 2 - '''[Sugar syntax macros]'''
    * Section 3 - '''[Sugar transformers]'''

'''What is a Tcl macro'''.

A macro is an operator implemented by transformation. Macros are procedures
that generate a Tcl program at compile time, and substitute it wherever
the programmer used their name as a command.

The concept is simple when explained by example.

Suppose you want to write a `clear` command that sets the variable ''varName'' to an empty string. It could be implemented using `[upvar]`, like this:

======
proc clear varName {
    upvar 1 $varName var
    set var {}
}
======

Every time the user types

======
clear myvar
======

`clear` is called, and the caller's `myvar` variable is set to
the [empty string].

As an alternative to calling a procedure that is able to alter the caller's
execution environment, we may want to automatically substitute every occurrence
of the command `clear varname` with `set varname {}`.

So basically we want that, when we write

======
clear myvar
======

in a program, it is substituted with

======
set myvar {}
======

as if the programmer had really typed `set myvar {}` instead of `clear myvar`.
That's the goal of the simplest form of [Sugar] macro.

Defining a new macro is very similar to creating a procedure.
The following is the implementation of [[clear]] as a macro:

======
sugar::macro clear argv {
    list set [lindex $argv 1] {{}}
}
======

It means: "If you encounter a command called `clear` inside the source code,
call the following procedure, putting all the parts of which the
command is composed in `$argv`, and substitute the occurrence of the `clear`
command and its arguments with what the procedure returns."

In other words:

When a procedure is compiled, for every
occurrence of the `clear` command inside the procedure, the above
macro body is called, with `$argv` set to a list that represents the arguments
used to call the macro (including the macro name itself as the first element).
The return value of the macro, which should be a list of the same form, is
substituted in place of the original macro call.

To make the example more concrete, see the following code:

======
proc foobar {} {
    set x 10
    clear x
}
======

'''Before compiling the procedure''', the `clear` macro is called
with `$argv` set to `clear x`.
The macro returns `set x {{}}`.
This return value is substituted in place of `clear x`.

After the proc was defined, we can use `[info body]`
to see what happened:

======
info body foobar
======

will output

======
set x 10
set x {}
======

`[Sugar]` makes it possible to use a macro like `clear` as if it was a Tcl
procedure, and the macro is called at compile time to produce the code
that replaces it.
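
The mechanics can be tried in plain Tcl, without [Sugar] installed. The following is only a minimal sketch, not how [Sugar] is implemented: `clearMacro` and `expandTokens` are hypothetical names, and the sketch assumes that the expander simply joins the returned tokens with spaces.

======
# Hypothetical stand-in for the body of the clear macro:
# takes the list of tokens, returns the list of replacement tokens.
proc clearMacro argv {
    list set [lindex $argv 1] {{}}
}

# Hypothetical helper: emulate the expander by calling the macro
# and joining the returned tokens with a single space.
proc expandTokens tokens {
    join [clearMacro $tokens] " "
}

puts [expandTokens {clear x}]   ;# prints: set x {}
======

The third element of the returned list is the two-character token `{}`, which is why the macro writes it as `{{}}`.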

But Tcl has `[uplevel]` and `[upvar]`, so what are macros
useful for? Fortunately they allow for many interesting things
not possible at all otherwise.  The following example
shows the first big advantage of macros:

'''1) Macros make Tcl faster, without forcing the user to inline code by hand.'''

When `clear` is implemented as macro, it runs 3 times faster in my Tcl 8.4.
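
To measure this on your own machine, a rough sketch with the built-in `[time]` command follows. It compares the `[upvar]`-based procedure with the code the macro would expand to. `withProc` and `withInline` are hypothetical names, the iteration count is arbitrary, and the ratio will vary with the Tcl version and the hardware.

======
proc clear varName {
    upvar 1 $varName var
    set var {}
}

# Calls the upvar-based procedure.
proc withProc {} {
    set x 10
    clear x
    return $x
}

# What the body looks like after the clear macro is expanded.
proc withInline {} {
    set x 10
    set x {}
    return $x
}

puts "proc call: [time withProc 100000]"
puts "inlined:   [time withInline 100000]"
======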

Also, `[upvar]` is one of the biggest obstacles to the ability
of the Tcl compiler to optimize Tcl [bytecode]. It's not impossible
that at some point Tcl will be able to run much faster
if the user can ensure that a given procedure is never the target of `[upvar]`.

Simple commands that involve the use of `[upvar]` can be even
simpler to write as macros. The following are four examples of simple macros:

======
# [first $list] - expands to [lindex $list 0]
sugar::macro first argv {
    list lindex [lindex $argv 1] 0
}

# [rest $list] - expands to [lrange $list 1 end]
sugar::macro rest argv {
    list lrange [lindex $argv 1] 1 end
}

# [last $list] - expands to [lindex $list end]
sugar::macro last argv {
    list lindex [lindex $argv 1] end
}

# [drop $list] - expands to [lrange $list 0 end-1]
sugar::macro drop argv {
    list lrange [lindex $argv 1] 0 end-1
}
======
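
For reference, this is what the four expansions compute, written directly in plain Tcl. The variable name `l` and its contents are only sample data.

======
set l {a b c d}

puts [lindex $l 0]         ;# what [first $l] expands to, prints: a
puts [lrange $l 1 end]     ;# what [rest $l] expands to,  prints: b c d
puts [lindex $l end]       ;# what [last $l] expands to,  prints: d
puts [lrange $l 0 end-1]   ;# what [drop $l] expands to,  prints: a b c
======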

[Sugar] supports three types of macros. We are dealing with
the simplest and most common kind: command macros.

The other two types, syntax macros and transformers,
will be covered later. For now let's create
a more complex macro.

'''A more complex example'''

Good macros do source code transformation in a smart way:
they turn a form that is understood by the
programmer into code that is also understood by the compiler,
code that would be hard to type and use in raw form without
macro support, but is optimal otherwise.

Ideally a macro should expand to a single command call
(possibly including many other nested calls), and should not
expand to code that magically creates variables at
runtime to store intermediate results whenever that
can be avoided, because there may be collisions with
variables in the function, or with variables created by other bad macros.
(Btw, the TODO list of Sugar includes a way to generate
unique local variable names.)

If the macro is well written, the programmer can use it like
any other command without much care.

We will now see a more realistic example: a macro that implements
a very efficient `[lpop]` operator. It accepts only one
argument, the name of a variable, and returns the last
element of the list stored inside the given variable.
As a side effect, `lpop` removes the last element from the list
(it's something like the complement of `[lappend]`).

A pure-Tcl implementation is the following:

======
proc lpop listVar {
    upvar 1 $listVar list
    set res [lindex $list end]
    set list [lrange $list 0 end-1]
    return $res
}
======

This version of `lpop` is really too slow. When
`[lrange]` is called, it creates a new list object even
though the original one stored in the `$list` variable is going
to be freed and replaced by the copy. Modifying the list
in place is far better.

The `[lrange]` implementation is able to perform this
optimization if the object is not shared (if you don't
know about this stuff, try reading the Wiki page about the
`[K]` operator before continuing).

So it's better to write the proc using the `[K]` operator.
The lrange line should be changed to this:

======
set list [lrange [K $list [set list {}]] 0 end-1]
======

With `K` being:

======
proc K {x y} {
    return $x
}
======

But even calling `[K]` is costly in terms of performance, so
why not inline it too? Doing so requires changing
the previous lrange line to this:

======
set list [lrange [lindex [list $list [set list {}]] 0] 0 end-1]
======

That's really a mess to read, but it runs at a different speed and,
even more important, with a different time complexity!
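
Putting the pieces together, here is a sketch of the whole procedure with `[K]` inlined. It assumes that `lpop` must remove the last element, so the range used is `0 end-1`. The variable name `stack` is only sample data.

======
proc lpop listVar {
    upvar 1 $listVar list
    set res [lindex $list end]
    # $list is substituted before [set list {}] runs, so the variable
    # no longer holds a reference to the list when lrange receives it.
    set list [lrange [lindex [list $list [set list {}]] 0] 0 end-1]
    return $res
}

set stack {1 2 3}
puts [lpop stack]   ;# prints: 3
puts $stack         ;# prints: 1 2
======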

With a macro for `lpop`, we can go even faster, and the code is easier to
maintain and read. Macros are allowed to expand to commands containing
other macros, recursively. This means that we can write a macro
for every single step of `lpop`. We need the `first`,
`last` and `drop` macros already developed, and a macro for `[K]`:

======
sugar::macro K argv {
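    # The first element of argv is the macro name itself, so skip it.
    foreach {_ x y} $argv break
    # Return two tokens: "first" and the verbatim token [list x y].
    list first "\[list $x $y\]"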
}
======

Note that for speed, we used `[foreach]` instead of two calls to `[lindex]`.
But remember that macros '''don't have to be fast in the generation''' of the
expanded code.

`K $x $y` expands to `first [[list $x $y]]`, which expands to `lindex [[list $x $y]] 0`.

We have one last problem. Even after the optimization and the
use of `K` inline, the procedure above requires a local
variable `res` to save the last element of the list before
modifying it, so that `$res` can be used later as the return value.
We don't want to create local vars in the code that calls
the `lpop` macro, nor do we want to expand to more than a single
command. The `[K]` operator can help us do so:

======
set res [lindex $list end]
set list [lrange [lindex [list $list [set list {}]] 0] 0 end-1]
return $res
======

leading to:

======
K [lindex $list end] [set list [lrange [lindex [list $list [set list {}]] 0] 0 end-1]]
======
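
To check that a single command of this shape really pops the last element, here is a runnable sketch that uses `K` as an ordinary procedure instead of inlining it, and `0 end-1` as the range so that the last element is the one removed. The variable `popped` is hypothetical and only there to show the result.

======
proc K {x y} {
    return $x
}

set list {a b c}

# The first argument of the outer K is evaluated before the second,
# so the last element is read before the list is shortened.
set popped [K [lindex $list end] [set list [lrange [K $list [set list {}]] 0 end-1]]]

puts $popped   ;# prints: c
puts $list     ;# prints: a b
======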

That's ok, but what unreadable code! Thanks to macros
we can abstract from the fact that calling procedures is
slow, so we just write:

======
K [last $list] [set list [drop [K $list [set list {}]]]]
======

It will not win the clean-code contest this year, but it's much
better than the previous version. Ok... now we want a macro that, every time we
type `lpop list`, expands to the above line:

======
sugar::macro lpop argv {
    set varname [lindex $argv 1]
    set argv [list K \
        {[last $%varname%]} \
        {[set %varname% [drop [K $%varname% [set %varname% {}]]]]}
    ]
    foreach i {1 2} {
        lset argv $i [string map [list %varname% $varname] [lindex $argv $i]]
    }
    return $argv
}
======

There are a few things to note about this code. The macro returns
a list, where every element is a token of a Tcl command
in the source code. This does not mean that arguments which happen
to represent a script have to be transformed into lists as well. Also
note that the input list of the macro is just a list of tokens
that are *exactly* what the user typed in the source code, verbatim.
It follows that the tokens are already quoted and are valid
representations of a procedure argument.
We don't need to care about the fact that they must be interpreted
as a single argument, as we would when generating code for `[eval]`.

This allows the macro developer to use templates for macros. In
fact the `lpop` macro is just using a three-argument template,
and the final foreach substitutes the name into the arguments that need
to refer to the variable. You don't have
to care what that variable name is. It can be a complex string
formed by more commands, vars, and so on `[[like]][[this]]$and-this`.
If it was a single argument in the source code, it will still be a single
argument after the expansion.
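
The template substitution itself is ordinary `[string map]` and can be tried on its own. In this sketch the template is written as one string for brevity, and `stack` is a hypothetical variable name standing in for whatever the user typed.

======
set template {K [last $%varname%] [set %varname% [drop [K $%varname% [set %varname% {}]]]]}

# Replace every occurrence of the placeholder with the variable name.
puts [string map [list %varname% stack] $template]
# prints: K [last $stack] [set stack [drop [K $stack [set stack {}]]]]
======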

Another interesting thing to note is that we don't really have
to return every token as a different element of the list. In practice
we can even return everything as a single-element list.
The rule is that the macro expander takes care to put an argument
separator, like a tab or a space, between the elements of the
list, and a command separator, like a newline or `;`, at the end.
If we put the spaces in ourselves, we can just return a single-element list.

So, the `lpop` macro can also be written in this way:

======
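sugar::macro lpop argv {
    set varname [lindex $argv 1]
    # Fill the template with string map: %varname% is not a valid
    # conversion specifier for format.
    set cmd [string map [list %varname% $varname] {
        K [last $%varname%] [set %varname% [drop [K $%varname% [set %varname% {}]]]]
    }]
    return [list $cmd]
}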
======

This is much simpler and cleaner, and this style is perfectly usable.
The difference is that returning every token as
a separate element of a list lets `[Sugar]` macros
leave the indentation of the original code unaltered. This helps
both to keep the line numbers in procedure errors correct, and to
get good-looking output from `[info body]`. But since
most macros are about commands that are typed on a single line
together with all their arguments, for many macros it is just a matter of taste.

If you are implementing control structures that are:

======
indented in {
    this way
}
======

It's another question, and it's better to return every token as a
list element.

'''Number of arguments and other static checks in macros'''

In most cases macros expand to code that will raise an error if the number of
arguments is wrong, but it's possible to add this
check inside the macro itself. This is actually a big advantage of macros,
because they are able to signal a bad number of arguments at
compile time: this can help to write applications that are more reliable.
It's even possible to write a macro that expands to exactly what
the user typed in, but as a side effect does a static check for
a bad number (or format) of arguments:

======
sugar::macro set argv {
   if {[llength $argv] != 3 && [llength $argv] != 2} {
       error "Bad number of arguments for set"
   }
   return $argv
}
======

This macro returns `$argv` itself, so it's an identity transformation,
but it will raise errors for `[set]` with a bad number of
arguments even in code that will never be reached
in the application. Note that the previous macro for set is a bit
incomplete: to get it right we should add checks for arguments
that start with `[{*}]`. For this reason `[Sugar]` will provide a function
to automatically search for a bad number of arguments in some
future version.
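
The check can be tried outside of [Sugar] as a plain procedure. This is only a sketch of the intended logic: a `set` command has two or three tokens once the command name is counted, so anything else is rejected. `checkSetTokens` is a hypothetical name.

======
# Hypothetical helper: the same check as in the macro body,
# applied to a list of tokens that includes the command name.
proc checkSetTokens argv {
    set n [llength $argv]
    if {$n != 2 && $n != 3} {
        error "Bad number of arguments for set"
    }
    return $argv
}

puts [checkSetTokens {set x 10}]                  ;# prints: set x 10
puts [catch {checkSetTokens {set x 10 20}} msg]   ;# prints: 1
puts $msg                                         ;# prints: Bad number of arguments for set
======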

Note that `[{*}]` introduces for the first time the possibility for
a command to get a number of arguments that is not evident from reading
the source code, but is computed at runtime. Actually, `{*}` is an
advantage for static checks because prior to it, the way to
go was `[eval]`, which totally "hides" the called command, postponing
all the work to run time. With `{*}` it's always possible
to say from the source code that a command is called with *at least* N
arguments. Still, adding new syntax to Tcl will probably not play
well with macros and other forms of source code processing.

Identity macros are a very powerful way to perform static syntax checks.
They can warn not only about a bad number of arguments, but also about the
type of these arguments. See for example the following identity
macro for "string is":

======
proc valid_string_class class {
    set classes {alnum alpha ascii control boolean digit double false graph integer lower print punct space true upper wordchar xdigit}
    set first [string index $class 0]
    if {$first eq {$}} {return 1}
    if {$first eq {[}} {return 1}
    if {[lsearch $classes $class] != -1} {return 1}
    return 0
}

sugar::macro string argv {
    if {[lindex $argv 1] eq {is} && [llength $argv] > 2} {
        if {![valid_string_class [lindex $argv 2]]} {
            puts stderr "Warning: invalid string class in procedure [sugar::currentProcName]"
        }
    }
    return $argv
}
======

Thanks to this macro it's possible to ensure that errors such as writing
`string is number` instead of `[string is] integer` are discovered at
compile time. In this respect macros can be seen as a programmable
static syntax checker for Tcl. We will see how "syntax macros" are
even more useful in this respect. This is the second feature that
macros add to Tcl:

'''2) Macros are a powerful programmable static checker for Tcl scripts.'''

Actually I think it's worth using macros for this alone during
the development process, and then throwing them away.

'''Conditional compilation'''

That's small and neat: we can write a simple macro that expands to
some code only if a global variable is set to non-zero. Let's
write this macro that we call [[debug]].

======
sugar::macro debug argv {
   if {$::debug_mode} {
       list if 1 [lindex $argv 1]
   } else {
       list
   }
}
======

Then you can use it in your application as if it was a conditional:

======
# Your application ...
debug {
    set c 0
}
while 1 {
    debug {
        incr c
        if {$c > 100} {
            error "Too many iterations..."
        }
    }
    .... do something ....
}
======

If the value of `$::debug_mode` is true, all the `debug {something}` commands
are compiled as `if 1 {something}`. Otherwise, they will not be compiled at
all.
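
Both outcomes can be seen in plain Tcl with a stand-in for the macro body. This is a sketch only: `debugMacro` is a hypothetical name, the body token `{incr c}` is passed verbatim with its braces as the macro would receive it, and joining with spaces is an assumption about how the expander rebuilds the command.

======
# Hypothetical stand-in for the body of the debug macro.
proc debugMacro argv {
    if {$::debug_mode} {
        list if 1 [lindex $argv 1]
    } else {
        list
    }
}

# Tokens as typed in the source: the command name and a braced script.
set tokens [list debug "{incr c}"]

set ::debug_mode 1
puts [join [debugMacro $tokens] " "]   ;# prints: if 1 {incr c}

set ::debug_mode 0
puts [join [debugMacro $tokens] " "]   ;# prints an empty line
======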

That's the simplest example. You can write similar macros like `ifunix`,
`ifwindows`, `ifmac`, or even macros that expand to different procedure calls
depending on whether a given command is called with 2, 3 or 4 arguments. The limit is
your imagination.

'''New control structures'''

Not all programming languages allow writing new control structures.
Tcl is one of the better languages that don't put the programmer
inside a jail, but not all the programming languages that allow
writing new control structures are able to make them efficient.

Tcl macros can make new control structures as fast as [bytecode%|%byte-compiled]
control structures, because user-defined ones are usually syntax glue
for code transformations. Since macros are transformers
that translate one form into another, this is a good fit for them.

Here is a macro for the `?:` operator.

======
# ?: expands
#   ?: cond val1 val2
# to
#   if $cond {format val1} {format val2}
sugar::macro ?: argv {
    if {[llength $argv] != 4} {
        error "Wrong number of arguments"
    }
    foreach {_ cond val1 val2} $argv break
    list if $cond [list [list format $val1]] [list [list format $val2]]
}
======

The macro's comment shows the expansion performed.
Since it is translated to an `[if]` command, it's as fast as a
Tcl builtin.
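
The expansion works as an expression because `[if]` returns the result of the body it executes, and `[format]` with a single argument simply returns that argument. A small sketch of the expanded form, with sample values chosen for illustration:

======
set x 5

# What "?: {$x > 0} positive non-positive" would expand to.
set sign [if {$x > 0} {format positive} {format non-positive}]

puts $sign   ;# prints: positive
======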

'''How a macro knows what's a script?'''

In Tcl there are no types, nor special syntaxes for what is code
and what is just a string, so you may wonder why macros are
not expanded in the following code:

======
puts {
    set foo {1 2 3}; [first $foo]
}
======

But they are expanded in this:

======
while 1 {
    set foo {1 2 3}; [first $foo]
}
======

I guess this is one of the main problems developers face when designing a
macro system for Tcl, and also one of the strongest arguments for the idea
that a good macro system for Tcl is impossible, because you can't
say what is code and what isn't.

[Sugar] was designed to address this problem in the simplest possible
way: because it can't say if an argument is a script or not,
macro expansion is not performed in arguments, so in theory [Sugar]
will expand neither the code that's the argument to `[puts]`, nor the one to `[while]`.

But of course, in the real world, for a macro system to be usable,
macros should be expanded inside the `[while]`, and not expanded in `[puts]`.
So the idea is that for commands that you know accept a script as an argument,
you write a macro that returns the same command but with its
script arguments macro-expanded. It is very simple, and in practice
this works well. For example, this is the macro for `[while]`:

======
sugar::macro while argv {
    lset argv 1 [sugar::expandExprToken [lindex $argv 1]]
    lset argv 2 [sugar::expandScriptToken [lindex $argv 2]]
}
======

And this is the macro for `[if]`:

======
sugar::macro if argv {
    lappend newargv [lindex $argv 0]
    lappend newargv [sugar::expandExprToken [lindex $argv 1]]
    set argv [lrange $argv 2 end]
    foreach a $argv {
        switch -- $a {
            else - elseif {
                lappend newargv $a
            } 
            default {
                lappend newargv [sugar::expandScriptToken $a]
            }
        }
    }
    return $newargv
}
======

As you can see, [Sugar] exports an API to perform expansion in
Tcl scripts and expr expressions. There are similar macros for
`[switch]`, `[for]`, and so on. If you write a new conditional or
loop command with macros, you don't need this API at all, because
the macro will translate to code that contains some form of
a well-known built-in conditional or loop command, and we already
have macros for those (remember that macros can return code containing
macros).

If you write any other command that accepts a Tcl script
or expr expression as an argument, just write a little macro for it to do
macro expansion. This has a nice side effect:

======
proc nomacro script {
    uplevel 1 $script
}
======

Don't write a macro for nomacro, and you have a ready-to-use
command that works as a barrier for macro expansion.

Continue with section 2 - '''[Sugar syntax macros]'''

----

[WHD]: This is very cool, but I have to ask--why not allow macros to
have a standard Tcl argument list?  That is, 

======
sugar::macro mymacro args {...}
======

Gives the behavior you describe here, while

======
sugar::macro mymacro {a b c} {...}
======

explicitly creates a macro that takes three arguments and will
generate a standard error message if you supply some other number?

----

[SS]: This can be a good idea, since it is always possible to
use `args` as the only argument to get the current behaviour. I used
a single list as input mainly because the same macro can have more
than one name, and in order to have the same interface for both
command macros and syntax macros. For example:

======
sugar::macro {* + - /} argv {
    list expr [list [join [lrange $argv 1 end] " [lindex $argv 0] "]]
}
======

This will handle `* + - /` with the same code. Macros with more than one name
may in extreme cases even give different meanings to arguments in
the same position. Btw there is 'args' for this case. So I can change
the API to something like this:

======
sugar::macro {name arg1 arg2 ...} {...}
======

That's like a Tcl proc, but with the name that was used to call the macro
as the first argument. For syntax macros, this format actually may not make a
lot of sense, but there is still `args`. I'll include this change in the
next version if I don't receive feedback against it. Thanks for the
feedback WHD.

[WHD]: I think that on the whole I prefer the previous syntax for command
macros; the macro can always have an implicit argument that is the
macro name.  For example,

======
# Identity macro
sugar::macro {+ - * /} {args} { return "$macroname $args" }
======

[JMN]: I'd just like to add my vote for removing the macroname as first argument syntax.
From my hacking about, it seems easy to make it implicitly available more or less as WHD suggests.
(I don't *think* I broke anything.. )

----

[SS]: For a different question about the sugar API, I wonder if Tclers
interested in this macro system feel better about the current redefinition
of `[proc]`, or if it's better to provide a `sugar::proc` that's
exactly like `[proc]` but with macro expansion.

If the API will remain the current with `[proc]` redefined, I'll add
in the wrapper an option `-nomacro` that will just call the original command.
Please add your name with optional motivation below.

Yes, I think it's better to wrap the real `[proc]`:
    * Put your name here if you are for this solution.

No, I want macro expansion only using `sugar::proc`:
    * [SS] (avoids wasting CPU time on procs that don't use macros, this can be a big difference if you `[package require]` sugar before Tk or other big packages)
    * [DKF]: Avoiding overriding the real [proc] allows packages to use sugar if they want without surprising packages that don't expect it.  Packages that do want it can just do [[[namespace import] ::sugar::proc]] into their own private workspace.

[WHD]: Since you have to override the standard control structures to make macros work, it seems to me that what you really need is a pair of commands:

======
sugar::configure -enabled 1

# Macros expanded in body
proc myproc {a b c} {....}

# Macros expanded in expression and body
while {$a > $b} {....}

sugar::configure -enabled 0

# Macros no longer expanded.
======

[SS]: Actually [Sugar] overrides nothing! (so it will expand everything at compile time, with no run-time overhead).
It does expansion inside control structures just by using macros for `[while]` and so on.
This is better explained on this page in the section '''How a macro knows what's a script?'''. So whether to override `[proc]` or to provide a private
proc-like command is just a matter of design (or taste); everything will work in both cases.

----

[alpha_tcler] 2016-04-06:
The library should make it easy to choose whether I want macros inside ::sugar::proc or used elsewhere.
Secondly, the examples should be corrected (with args replacing argv).
Great work, macros make Tcl a first-class Lisp-equivalent language.

[PYK] 2016-04-06:  Now other [Lisp] languages just need to grow `[uplevel]`,
`[upvar]`, [coroutine%|%coroutines], explicit [tailcall%|%tailcalls], and
[interp%|%safe interpreters], in order to become first-class Tcl-equivalent
languages :p If you really want to go down the macro rabbit hole, check out
[procstep].

<<categories>> Dev. Tools | Sugar