'''Code Generation''' is a common practice in [Tcl] [script%|%scripts]. This page describes various related techniques.

** See Also **

   [Template And Macro Processing]:   many examples

   [expand]:   the excellent processor by [Will Duquette].

   [TemplaTcl: a Tcl template engine]:   follows a very similar approach to the g2pp tool presented below.

   [Critcl]:   [AM] 2007-08-29: an application that generates the code to interface to [C] functions. See [Critcl goes Fortran] for a similar effort to generate code to interface to/from Fortran routines.

   [C code generators]:

   [ctrans]:

   [literate programming]:

** Examples **

   [http://openacs.org/api-doc/proc-view?proc=ad_proc&source_p=1&version_id=%|%ad_proc]:   an [OpenACS] procedure that generates the body of a [proc%|%procedure].

   [Functional imaging]:

   [Stephen Uhler's HTML parser in 10 lines]:

   [do...until in Tcl]:   the `tailcall` implementation uses `[string map]` to generate a properly-formatted Tcl script.

** Description **

The two most important Tcl commands for code generation are `[string map]` and `[list]`. In general, avoid `[subst]`: it splices values into the script as raw strings, so spaces and other special characters in the substituted values end up unquoted, which is almost always not what's desired.

Use `[string map]` to substitute tokens in a code template, and `[list]` to armour each replacement value as a word:

======
set body [string map [list @token1@ [list $var1]] $body[set body {}]]
======

When the replacement value is a list of words, each of which should be a separate word in the generated code, don't use `[list]`:

======
set body [string map [list @args@ $args] $body[set body {}]]
======

An alternative to `[string map]` is to write the script as a double-quoted string and use `[list]` for armour:

======
set body "some_command \$somevar [list $var1] \"some value\" $more_words {another value}"
======

But as the small example illustrates, backslashes can quickly proliferate.

----

[Koen Van Damme]: Many [C++] developers use tools that generate C++ code.
Many of these tools are even developed in-house. Here I would like to collect some ideas for how to use Tcl to assist in generating such code.

[Koen Van Damme]: For some time now, I've been running around with the idea of '''using Tcl as a preprocessor language for C++'''. You know, replace those "#define" and "#if" with Tcl's `[set]` and `[if]`. Not to mention the incredible power of having `[foreach]`, `[proc]` and even [channel%|%file I/O] available in the preprocessor! Just imagine...

Generating code from Tcl can be as easy as having a few `[puts]` statements produce the desired output. But we can go a lot further than that. I have written a paper about some tips and techniques you can use in [http://www.gener8.be/site/articles/code_generation/code_generation.html%|%Turn your scripting language into a code generator].

-------
'''Avoiding `[puts]`.'''

So, where do we begin? Of course you can generate C++ like this:

======
foreach animal { Cat Dog Snake } {
    puts "class $animal : public Animal \{"
    puts "public:"
    puts "    // Constructor and destructor"
    puts "    $animal ( ) \{"
    puts "        printf(\"Creating a new \\\"$animal\\\"\\n\");"
    puts "    \}"
    puts "    ~$animal ( ) \{ \}"
    puts "\};"
}
======

The first drawback of this approach is its general ugliness. Bulky escape sequences and lots of `[puts]` statements make it hard to see what the resulting output will look like. Because backslashes, braces, quotes and other characters have a special meaning in Tcl, each one that belongs to the output has to be escaped with a preceding backslash.

The solution is to use a small tool that automatically provides the calls to `[puts]` and all the required escape characters. I have two such tools for you to download at [http://www.gener8.be/site/downloads/index.html%|%gener8.be] (documentation is [http://www.gener8.be/site/articles/code_generation/code_generation.html%|%here]). These tools allow Tcl and other scripting languages to be used as a preprocessor language for any kind of text output, in particular for [C]/[C++] code.
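Short of a dedicated tool, plain Tcl can already remove most of the escaping: keep the C++ text in a braced template and fill in a placeholder with `[string map]`, the command recommended in the Description above. The following is only a minimal sketch. The `@name@` placeholder and the names `classTemplate` and `generateClass` are made up for illustration and are not part of g2pp or FrontLine. It also assumes that the braces in the template are balanced, which a braced Tcl word requires.

======
# Hypothetical template. Inside a braced word, quotes and backslashes
# are taken literally, so the C++ text needs no extra escaping.
set classTemplate {class @name@ : public Animal {
public:
    // Constructor and destructor
    @name@ ( ) {
        printf("Creating a new \"@name@\"\n");
    }
    ~@name@ ( ) { }
};}

# Hypothetical helper: replace every @name@ token in the template.
proc generateClass {template name} {
    return [string map [list @name@ $name] $template]
}

foreach animal {Cat Dog Snake} {
    puts [generateClass $classTemplate $animal]
}
======

Since the replacement value ends up in C++ text rather than in a Tcl script, it is not armoured with `[list]` here.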
Thanks to the g2pp tool, I can now rewrite the `puts` loop above like this:

======
foreach animal { Cat Dog Snake } {
@
class $(animal) : public Animal {
public:
    // Constructor and destructor
    $(animal) ( ) {
        printf("Creating a new \"$(animal)\"\n");
    }
    ~$(animal) ( ) { }
};
@
}
======

Note how the `@` sign switches between "pure Tcl" mode and "generating escaped output" mode.

This looks a lot better already, because ''it looks a lot like the code we intend to generate''. The input that we type is very close to the output that we want to produce, so it becomes easier to predict what the code generator will do. Apart from variable substitutions such as `$(animal)`, we can write plain old C++ code between the `@` characters. Note that we do not have to escape quotes and backslashes anymore.

A more advanced tool, called [FrontLine], offers additional features and in particular takes care of proper indentation (which is crucial when using Python). With FrontLine, we can write the above like this:

======
= tcl
foreach animal {Cat Dog Snake} {
    = cxx
    class $(animal) : public Animal {
    public:
        // Constructor and destructor
        $(animal) ( ) {
            printf("Creating a new \"$(animal)\"\n");
        }
        ~$(animal) ( ) { }
    };
}
======

Please refer to the [FrontLine] page for more information; here we will dive deeper into g2pp.

-------
'''Buffering the output in a clipboard.'''

Another drawback is that I cannot come back and re-edit part of the code later. Once I do `puts`, the output is gone. I need some kind of buffer in which I can write an empty class body (already including the closing curly brace) and then come back and fill in the class body at a later point in time.

My [Text Clipboard] library is a first attempt to provide such a buffering mechanism. It stores text as a tree of nodes, which can be inserted, removed, or changed. Only when you're really, really satisfied with the result do you output the clipboard, all at once.
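To make the buffering idea concrete, here is a rough sketch of a tree-shaped text buffer in plain Tcl (8.5 or later, for `dict` and `lassign`). It is '''not''' the [Text Clipboard] API: the `clip` namespace and its `new`, `text`, `plug` and `render` commands are hypothetical names chosen for this example, and removing or changing nodes is left out.

======
# Hypothetical mini-buffer, not the Text Clipboard library.
# Each node is a list of items: {text <string>} or {node <childName>}.
namespace eval clip {
    variable nodes [dict create]
}

# Create an empty node that can be filled in later.
proc clip::new {name} {
    variable nodes
    dict set nodes $name {}
    return $name
}

# Append literal text to a node.
proc clip::text {name string} {
    variable nodes
    dict lappend nodes $name [list text $string]
}

# Append a reference to a child node, creating the child if needed.
proc clip::plug {name child} {
    variable nodes
    if {![dict exists $nodes $child]} { new $child }
    dict lappend nodes $name [list node $child]
}

# Flatten a node and all of its children into one string.
proc clip::render {name} {
    variable nodes
    set out ""
    foreach item [dict get $nodes $name] {
        lassign $item kind value
        if {$kind eq "node"} {
            append out [render $value]
        } else {
            append out $value
        }
    }
    return $out
}

# Write the class shell first, closing brace included ...
clip::new class
clip::text class "class Cat : public Animal \{\n"
clip::plug class body
clip::text class "\};\n"

# ... and fill in the body afterwards.
clip::text body "    Cat ( );\n"
clip::text body "    ~Cat ( );\n"

# Nothing is written until the whole tree is rendered.
puts -nonewline [clip::render class]
======

The point of the design is that `clip::text body ...` can be called long after the closing brace of the class has been stored, which a sequence of `puts` calls cannot do.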
-------
'''Imitating C++.'''

One thing I ''love'' about Tcl is its extremely flexible syntax! For example, it is very easy to write some Tcl procedures that look and feel like C++:

======
# A procedure that can be called like a C++ class declaration
proc class {name args} {
    # The last argument is the class body; anything before it
    # (e.g. ": public Animal") is the base-class clause
    set body [lindex $args end]
    # Store name of class in global variable
    global classname
    set classname $name
    ...
    uplevel 1 $body
    ...
}

# A procedure that looks like C++ comments
proc // args {
    # Ignore args
}

# Two procedures to switch to "public" access mode
proc public args {
    global access
    set access "public"
}
proc public: {} {
    public
}
# etc etc
======

It is amazing how much you can make Tcl syntax look like the syntax of another language. [RS] is a Tcl wizard who has written a ''lot'' of these language imitations, including [Playing Prolog] and [an APL playstation]!

Combining these C++ lookalike procs with the clipboards introduced earlier, we can now instruct our preprocessor to produce C++ code thusly:

======
foreach animal {Cat Dog Snake} {
    class $animal : public Animal {
        public:
        // The default constructor and destructor
        @
        $(animal) ( ) {
            printf("Creating a new \"$(animal)\"\n");
        }
        ~$(animal) ( ) { }
        @
    }
}
======

You see that we call the procs `class`, `public:` and `//`. The output (the entire class body) is sent to a clipboard, so that we can add more methods or data members later.

The details of how these special procedures can be implemented, and how you can use them for code generation, are on my [http://users.pandora.be/koen.vandamme1/c_tools/g2/g2.html%|%old homepage].

-------
'''Roles.'''

Let's say we now implement the following procedure:

======
proc role_regular args {
    # Remember we stored the current class name in a global variable?
    global classname
    add_to_header $classname {
        @
        $(classname)();
        ~$(classname)();
        $(classname)(const $(classname)& other);
        $(classname)& operator= (const $(classname)& other);
        @
    }
}
======

It is meant to be invoked inside a class body, so that the global variable ''classname'' is properly set. It adds the four "regular" methods to the header of the class: default constructor, copy constructor, destructor, and assignment operator. A similar proc can be written to produce a skeleton for the actual implementation of these methods. Note that ''add_to_header'' is some procedure that adds new code to the declaration of a class; this is possible thanks to the clipboard mechanism (the class declaration is a clipboard with a plug inside where new code can be added).

By simply adding one call to ''role_regular'' in our class, we now get these four methods for free. Each of these methods has a plug in its implementation, so that we can plug in our own code later (e.g. to add the actual assignment code in the assignment operator's body).

Again, more details (and other examples of ''very'' powerful roles for C++) are on my [http://users.pandora.be/koen.vandamme1/c_tools/g2/g2.html%|%old homepage].

As a closing remark: roles are little pieces of functionality that automatically distribute themselves throughout a class implementation. I owe this idea to '''Luc De Ceulaer''' and his team, who developed a lot of the concepts of role-oriented design.

-------
'''Under construction.'''

This page, like any page on the Wiki, is permanently under construction. So far I have only described a small number of pieces of the code generation puzzle. I have used very little of Tcl's power. The examples above show how you can use `[foreach]` to produce repeated output (something that is already quite difficult to achieve with the standard CPP preprocessor), and how you can write Tcl procedures to generate code for classes and roles.
But ''many more things are possible, given the power of Tcl'' and the simplicity of the G2 preprocessor. The idea is to end up with an extremely powerful language that expands a small input file with mixed Tcl and C++ code into a much larger "pure C++" file. I hope you can feel how powerful this could be. I also hope that you will come up with additional ideas and snippets of code. Please add your own brainwaves to this page!

<<categories>> Dev. Tools