Code Generation

Code generation is common in Tcl scripts. This page describes various techniques for it.

See Also

Template And Macro Processing
many examples
expand
the excellent processor by Will Duquette.
TemplaTcl: a Tcl template engine
follows a very similar approach to the g2pp tool presented below.
Critcl
AM 2007-08-29: an application that generates the code to interface to C functions. See Critcl goes Fortran for a similar effort to generate code to interface to/from Fortran routines.
C code generators
ctrans
literate programming

Examples

Brainfuck-to-Tcl transpiler
Generates an equivalent Tcl script.
add_proc
an OpenACS procedure that generates the body of a procedure.
Functional imaging
Stephen Uhler's HTML parser in 10 lines
do...until in Tcl
the tailcall implementation uses string map to generate a properly-formatted Tcl script.
fileutil::magic::fileType
Reads magic files and generates a Tcl script to recognize the type of content in a file.

Description

The two most important Tcl commands for code generation are string map and list. In general, avoid subst: it performs plain string substitution, so spaces and other special characters in the substituted values end up unquoted in the generated code, which is almost never what is wanted.

Use string map to substitute tokens in a code template, and list to armour each replacement value as a word:

set body [string map [list @token1@ [list $var1]] $body[set body {}]]
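A minimal sketch of what the inner list buys you. The template, the proc name greet and the value are made up for illustration:

```tcl
# Template for a generated procedure; @greeting@ is the placeholder token.
set template {
    proc greet {} {
        return @greeting@
    }
}

# A value containing a space and a dollar sign.
set value {hello $world}

# Armoured: [list $value] quotes the value so that it stays a single,
# literal word in the generated script.
eval [string map [list @greeting@ [list $value]] $template]
set armoured_result [greet]

# Unarmoured: the raw value is pasted in, so the generated body contains
# two words and a variable reference that cannot be resolved.
eval [string map [list @greeting@ $value] $template]
set unarmoured_failed [catch {greet}]

puts $armoured_result
puts $unarmoured_failed
```

With the armour, greet returns the value unchanged. Without it, calling the generated procedure raises an error.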

When the replacement value is a list of words, each of which should be a separate word in the generated code, don't use list:

set body [string map [list @args@ $args] $body[set body {}]]
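The difference between splicing and armouring can be seen in a small sketch. The template, the proc name call_it and the variable words are assumptions for the example; words is used instead of args because the code runs outside a proc:

```tcl
# The generated procedure reports how many words replaced @args@.
set template {
    proc call_it {} {
        return [llength [list @args@]]
    }
}

set words [list one {two three} four]

# Spliced: the string representation of the list becomes three words.
eval [string map [list @args@ $words] $template]
set spliced_count [call_it]

# Armoured: [list $words] turns the whole list into one single word.
eval [string map [list @args@ [list $words]] $template]
set armoured_count [call_it]

puts "$spliced_count $armoured_count"
```

This relies on words being a well-formed list, so that its string representation is a sequence of properly quoted words.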

An alternative to string map is to use a literal value along with list for armour:

set body "some_command \$somevar [list $var1] \"some value\" $more_words {another value}"

But as the small example illustrates, backslashes can quickly proliferate.


As discussed in the Tcl Chatroom, 2016-02-11, special care is required to generate a script from a list of commands. Because the last word in a command might be \, and because \ followed by a newline has a special interpretation, simply joining the list by a single \n could have surprising results:

set s1 "set x \\"
set s2 {set y $x}
set s $s1\n$s2
eval $s ;#-> set varName ?newValue?

Therefore, the complete special sequence \\\n should be used for the task:

set s1 "set x \\"
set s2 {set y $x}
set s $s1\\\n$s2
eval $s
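The two variants can be compared side by side. This is only a sketch of the trailing-backslash case shown above; the variable names naive, safe and naive_failed are made up:

```tcl
unset -nocomplain x y

set s1 "set x \\"
set s2 {set y $x}

# Joined by a bare newline: the trailing backslash and the newline
# collapse into a space, so both commands merge into one broken command.
set naive $s1\n$s2
set naive_failed [catch {eval $naive}]

# Joined by backslash + newline: the two backslashes now form an
# escaped backslash, and the newline survives as a command separator.
set safe $s1\\\n$s2
eval $safe

puts $naive_failed
puts $x
puts $y
```

After the second eval, both x and y hold a single backslash.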

Example: C++

Koen Van Damme:

Many C++ developers use tools that generate C++ code. Many of these tools are even developed in-house. Here I would like to collect some ideas for how to use Tcl to assist in generating such code.

Koen Van Damme: For some time now, I've been running around with the idea of using Tcl as a preprocessor language for C++. You know, replace those "#define" and "#if" with Tcl's set and if. Not to mention the incredible power of having foreach, proc and even file I/O available in the preprocessor! Just imagine...

Generating code from Tcl can be as easy as having a few puts statements produce the desired output. But we can go a lot further than that. I have written a paper about tips and techniques you can use: "Turn your scripting language into a code generator".


Avoiding puts.

So, where do we begin? Of course you can generate C++ like this:

foreach animal { Cat Dog Snake } {
    puts "class $animal : public Animal \{"
    puts "public:"
    puts "   // Constructor and destructor"
    puts "   $animal ( ) \{"
    puts "      printf(\"Creating a new \\\"$animal\\\"\\n\");"
    puts "   \}"
    puts "   ~$animal ( ) \{ \}"
    puts "\};"
}

The first setback of this approach is its general ugliness. Bulky escape sequences and lots of puts statements make it hard to see what the resulting output will look like. The special meaning of backslash, braces and other characters in Tcl forces us to escape them with a preceding backslash.

The solution is to use a small tool that automatically provides the calls to puts and all the required escape characters. I have two such tools for you to download at gener8.be (documentation is here). These tools allow Tcl and other scripting languages to be used as a preprocessor language for any kind of text output, in particular for C/C++ code.

Thanks to the g2pp tool, I can now rewrite the above like this:

foreach animal { Cat Dog Snake } {
    @
    class $(animal) : public Animal {
    public:
        // Constructor and destructor
        $(animal) ( ) {
            printf("Creating a new \"$(animal)\"\n");
        }
        ~$(animal) ( ) { }
    };
    @
}

Note how the '@' sign switches between "pure Tcl" mode and "generating escaped output" mode. This looks a lot better already, because it resembles the code we intend to generate. The input that we type is very close to the output that we want to produce, so it becomes easier to predict what the code generator will do. Apart from variable substitutions such as $(animal), we can write plain old C++ code between the @ characters, and we no longer have to escape quotes and backslashes.
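For comparison, much of the escaping can also be avoided in plain Tcl by keeping the C++ text in a braced template and replacing a placeholder with string map. This is only a sketch, not how g2pp works internally; the @animal@ placeholder and the proc name gen_class are made up here:

```tcl
# The C++ text is kept verbatim inside braces; only @animal@ is special.
# The braces in the template must be balanced, which is normally true of C++.
set template {class @animal@ : public Animal {
public:
    // Constructor and destructor
    @animal@ ( ) {
        printf("Creating a new \"@animal@\"\n");
    }
    ~@animal@ ( ) { }
};
}

# The output is C++, not Tcl, so the replacement is not armoured with list.
proc gen_class {template animal} {
    return [string map [list @animal@ $animal] $template]
}

foreach animal {Cat Dog Snake} {
    puts [gen_class $template $animal]
}
```

The price is that control flow such as foreach cannot be interleaved with the template text as freely as with the @ mode switch.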

A more advanced tool, called FrontLine, offers additional features and in particular takes care of proper indentation (which is crucial when using Python). With FrontLine, we can write the above like this:

= tcl
foreach animal {Cat Dog Snake} {
    = cxx
    class $(animal) : public Animal {
    public:
        // Constructor and destructor
        $(animal) ( ) {
            printf("Creating a new \"$(animal)\"\n");
        }
        ~$(animal) ( ) { }
    };
}

Please refer to the FrontLine page for more information; here we will dive deeper into g2pp.


Buffering the output in a clipboard.

Another setback is that I cannot come back and re-edit part of the code later. Once I do puts, the output is gone. I need some kind of buffer in which I can write an empty class body (already including the closing curly brace) and then come back and fill up the class body at a later point in time.

My Text Clipboard library is a first attempt to provide such a buffering mechanism. It stores text as a tree of nodes, which can be inserted, removed, or changed. Only when you're really, really satisfied with the result do you output the clipboard, all at once.
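The idea of such a buffer can be sketched in a few lines. This is not the Text Clipboard library itself: the clip namespace and its commands add, plug, fill and render are hypothetical, and a flat list stands in for the tree of nodes. It assumes Tcl 8.5 or later for lassign and {*}.

```tcl
namespace eval clip {
    variable parts {}   ;# ordered fragments: {text STRING} or {plug NAME}
    variable fills      ;# array: plug name -> list of lines added later
    array set fills {}

    # Append literal text to the buffer.
    proc add {string} {
        variable parts
        lappend parts [list text $string]
    }
    # Reserve a named spot that can be filled in afterwards.
    proc plug {name} {
        variable parts
        variable fills
        lappend parts [list plug $name]
        set fills($name) {}
    }
    # Add one line to a previously reserved spot.
    proc fill {name line} {
        variable fills
        lappend fills($name) $line
    }
    # Produce the final text, all at once.
    proc render {} {
        variable parts
        variable fills
        set out {}
        foreach part $parts {
            lassign $part kind value
            if {$kind eq "text"} {
                lappend out $value
            } else {
                lappend out {*}$fills($value)
            }
        }
        return [join $out \n]
    }
}

# Write the class with an empty body first, then come back and fill it.
clip::add "class Cat : public Animal \{"
clip::plug body
clip::add "\};"
clip::fill body "    Cat();"
clip::fill body "    ~Cat();"
puts [clip::render]
```

Nothing is written until render is called, so the closing brace can be stored before the body exists.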


Imitating C++.

One thing I love about Tcl is its extremely flexible syntax! For example, it is very easy to write some Tcl procedures that look and feel like C++:

# A procedure that can be called like a C++ class declaration
proc class {name body} {
    # Store name of class in global variable
    global classname
    set classname $name
    ...
    uplevel 1 $body
    ...
}

# A procedure that looks like C++ comments
proc // args {
    # Ignore args
}

# Two procedures to switch to "public" access mode
proc public args {
    global access
    set access "public"
}

proc public: {} {
   public
}

# etc etc

It is amazing how much you can make Tcl syntax look like the syntax of another language.
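A small runnable variation on the same idea. The method command and the log variable are made up so that there is something to observe, and class here takes the body as its last argument so that the ": public Animal" part can be written too:

```tcl
set classname {}
set access private
set log {}

# Called like a C++ class declaration; the body is the last argument.
proc class {name args} {
    global classname access log
    set classname $name
    set access private
    uplevel 1 [lindex $args end]
    lappend log "end of $name"
}

# C++-style comment: swallow all words.
proc // args {}

# Access specifier: just record the current mode.
proc public: {} {
    global access
    set access public
}

# Hypothetical command that records the access mode it was declared under.
proc method {signature} {
    global classname access log
    lappend log "$classname: $access $signature"
}

class Cat : public Animal {
    method {void purr()}
    public:
    // everything below is public
    method {void meow()}
}
puts [join $log \n]
```

The text after // is still parsed as Tcl words, so it must not contain unbalanced braces or quotes.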

RS is a Tcl wizard who has a lot of these language-imitations, including Playing Prolog and an APL playstation!

Combining these C++ lookalike-procs with the clipboards we implemented earlier, we can now instruct our preprocessor to produce C++ code thusly:

foreach animal {Cat Dog Snake} {
    class $animal : public Animal {
    public:
        // The default constructor and destructor
        @
        $(animal) ( ) {
            printf("Creating a new \"$(animal)\"\n");
        }
        ~$(animal) ( ) { }
        @
    }
}

You see that we call the procs class, public: and //. The output (the entire class body) is sent to a clipboard, so that we can add more methods or data members later.

The details of how these special procedures can be implemented, and how you can use them for code generation, are on my old homepage.


Roles.

Let's say we now implement the following procedure:

proc role_regular args {
    # Remember we stored the current class name in a global variable?
    global classname
    add_to_header $classname {
       @
       $(classname)();
       ~$(classname)();
       $(classname)(const $(classname)& other);
       $(classname)& operator= (const $(classname)& other);
       @
    }
}

It is supposed to be invoked when inside a class body, so that the global variable classname is properly set. It adds the four "regular" methods to the header of the class: default and copy constructor, destructor, and assignment operator. A similar proc can be written to produce a skeleton for the actual implementation for these methods. Note that add_to_header is some procedure that adds new code to the declaration of a class; this is possible thanks to the clipboard mechanism (the class declaration is a clipboard with a plug inside where new code can be added).
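Without g2pp and the clipboard library, the same role can be approximated in plain Tcl. In this sketch add_to_header is a hypothetical stand-in that merely collects lines in a global array named header, and @C@ is a made-up placeholder for the class name:

```tcl
# Stand-in for add_to_header: collect declaration lines per class.
array set header {}
proc add_to_header {class line} {
    global header
    lappend header($class) $line
}

# Add the four "regular" methods to the header of the current class.
proc role_regular {} {
    global classname
    set lines {
        {@C@();}
        {~@C@();}
        {@C@(const @C@& other);}
        {@C@& operator= (const @C@& other);}
    }
    foreach line $lines {
        add_to_header $classname [string map [list @C@ $classname] $line]
    }
}

set classname Cat
role_regular

puts "class $classname \{"
foreach line $header($classname) {
    puts "    $line"
}
puts "\};"
```

Because the declarations are stored rather than printed immediately, other roles can add to the same class before it is written out.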

By simply adding one call to role_regular in our class, we now get these four methods for free. Obviously, each of these methods has a plug in its implementation, so that we can plug in our own code later (e.g. to add the actual assignment code in the assignment operator's body).

Again, more details (and other examples of very powerful roles for C++) are on my old homepage.

As a closing remark: roles are little pieces of functionality that automatically distribute themselves throughout a class implementation. I owe this idea to Luc De Ceulaer and his team, who developed a lot of the concepts of role-oriented design.


Under construction.

This page, as any page on the Wiki, is permanently under construction. So far I only described a small number of pieces of the code generation puzzle. I have used very little of Tcl's power. The examples above show how you can use foreach to produce repeated output (something that is already quite difficult to achieve with the standard CPP preprocessor), and how you can write Tcl procedures to generate code for classes and roles. But many more things are possible, given the power of Tcl and the simplicity of the G2 preprocessor.

The idea is to end up with an extremely powerful language that expands a small input file with mixed Tcl and C++ code into a much larger "pure C++" file. I hope you can feel how powerful this could be. I also hope that you will come up with additional ideas and snippets of code. Please add your own brainwaves to this page!