Version 29 of taccle

Updated 2004-09-29 16:30:07 by jt

taccle -- Taccle is Another Compiler Compiler

taccle is a complement to fickle in that it reads a taccle specification file to generate working LALR(1) parser. In other words, it is to Tcl what yacc (or bison) is to C/C++. taccle differs from yeti in that the grammar is written before hand as a straight text file rather than generated by procedure calls. taccle generates pure Tcl rather than incr tcl and has embedded actions. Unlike tyacc [L1 ] taccle is written in pure Tcl 8.4.

taccle spec files are structured nearly identical to those used by yacc. The following example (blantantly stolen from chapter 8 of lex & yacc[L2 ]) may be interpreted equally by the two:

 %token A R

 %%

 start: x | y R;

 x: A R;
 y: A;

Incidentally both yacc and taccle would recognize the shift/reduce conflict above.

A Practical Example

Here is another example. The file has been compacted to make it better fit on the web page.

 %{
 #!/usr/bin/tclsh
 %}
 %token ID
 %start start 
 %%

 start: E        { puts "Result is $1" } ;

 E: E '+' T      { set _ [expr {$1 + $3}] }
    | E '-' T    { set _ [expr {$1 - $3}] }
    | T ;

 T: T '*' F      { set _ [expr {$1 * $3}] }
    | T '/' F    { set _ [expr {$1 / $3}] }
    | F ;

 F: '(' E ')'    { set _ $2 }
    | ID         { set _ $::yylval } ;

 %%
 source simple_scanner.tcl; yyparse

This is, of course, the infamous calculator example. Several new things are presented here:

  • There is a literal block (#!/usr/bin/tclsh) and a user subroutine section (source simple_scanner.tcl; yyparse).
  • The start state is explicitly given; by default taccle (like yacc) uses the first rule as a start state.
  • Associated with most of the rules are actions to take when that rule gets reduced. Here is where taccle deviates from yacc. Whereas yacc uses $$ for synthesized attribute taccle calls the variable $_.
  • The rules without explicitly declared actions use the default one set _ $1. This is analogous to yacc's $$ = $1;.
  • Yacc declares a global variable yylval typically as a C union. Because Tcl does not have unions, or for that matter variable types (for the most part), the %union and %type keywords are unnecessary. This example assumes that the token generator places into ::yylval attributes of the correct type (e.g., ID will not have a non-numeric attribute). Sorry, you'll have to do your own error checking.
  • This example does not make use of [pkg_mkIndex] so it is necessary to source in the token generator. In this case the generator was developed used fickle.

New with taccle version 0.4 are:

  • operator precedence (%left, %right, %nonassoc, and %prec)
  • new command line flags -w and --version
  • fixed error when calculating first and follow sets for certain recursive rules

With taccle version 0.3 are:

  • corrected epsilon transitions; they should work for 99.99% of all cases now
  • embedded (i.e., "mid-rule") actions

taccle version 0.2 has:

  • preliminary epsilon transitions (doesn't work yet for all conditions)
  • error recovery with the error token
  • rename all variables with -p option

There are some things taccle cannot handle. These are on my TODO list:

  • detect infinitely recursive rules (e.g., foo -> foo '42' ;)
  • inherited attributes (synthesized attributes are easy, inherited not so)

Downloads

taccle is protected by the GNU General Public License. You should read the README file before use; a complete set of examples are in the examples subdirectory. Familiarity with the Dragon book as well as lex & yacc would also prove useful.

Download taccle from below:


Comments below:


FPX: Nice. At the time, I wanted to write a supplementary program for yeti to parse a yacc-like input file and produce a parser from that. I never got around to it. I wanted to write the program twice: Once, based on yeti primitives, and second, using itself -- the litmus test for every compiler is to compile itself ;)

27sep04 jcw - The following small change to taccle.tcl produces output files which are easier to examine, by breaking up potentially huge [array get ...] lines:

  ######################################################################
  # handles actually writing parser to output files

  proc write_array {fd name values} {
      puts $fd "array set $name {"
      foreach {x y} $values {
        puts $fd "  [list $x $y]"
      }
      puts $fd "}"
  }

  proc write_dest_file {} {
      puts $::dest "
  ######################################################################
  # autogenerated taccle code below
  "
      write_array $::dest ::${::p}table [array get ::lalr1_parse]
      write_array $::dest ::${::p}rules [array get ::rule_table *l]
      write_array $::dest ::${::p}rules [array get ::rule_table *dc]
      write_array $::dest ::${::p}rules [array get ::rule_table *e]

      puts $::dest "

jt Thanks for the suggestion. I've incorporated your suggestion into version 0.4.


Return to Jason Tang


Category Application

Category Dev. Tools