Version 31 of taccle

Updated 2004-09-30 15:09:34 by jt

taccle -- Taccle is Another Compiler Compiler

taccle is a complement to fickle in that it reads a taccle specification file to generate working LALR(1) parser. In other words, it is to Tcl what yacc (or bison) is to C/C++. taccle differs from yeti in that the grammar is written before hand as a straight text file rather than generated by procedure calls. taccle is furthermore superior to yeti in that it generates pure Tcl rather than incr tcl and supports both embedded actions and operator precedence. Unlike tyacc [L1 ] taccle is written in pure Tcl 8.4.

taccle spec files are structured nearly identical to those used by yacc. The following example (blantantly stolen from chapter 8 of lex & yacc[L2 ]) may be interpreted equally by the two:

 %token A R

 %%

 start: x | y R;

 x: A R;
 y: A;

Incidentally both yacc and taccle would recognize the shift/reduce conflict above.

A Practical Example

Here is another example. The file has been compacted to make it better fit on the web page.

 %{
 #!/usr/bin/tclsh
 %}
 %token ID
 %start start 
 %%

 start: E        { puts "Result is $1" } ;

 E: E '+' T      { set _ [expr {$1 + $3}] }
    | E '-' T    { set _ [expr {$1 - $3}] }
    | T ;

 T: T '*' F      { set _ [expr {$1 * $3}] }
    | T '/' F    { set _ [expr {$1 / $3}] }
    | F ;

 F: '(' E ')'    { set _ $2 }
    | ID         { set _ $::yylval } ;

 %%
 source simple_scanner.tcl; yyparse

This is, of course, the infamous calculator example. Several new things are presented here:

  • There is a literal block (#!/usr/bin/tclsh) and a user subroutine section (source simple_scanner.tcl; yyparse).
  • The start state is explicitly given; by default taccle (like yacc) uses the first rule as a start state.
  • Associated with most of the rules are actions to take when that rule gets reduced. Here is where taccle deviates from yacc. Whereas yacc uses $$ for synthesized attribute taccle calls the variable $_.
  • The rules without explicitly declared actions use the default one set _ $1. This is analogous to yacc's $$ = $1;.
  • Yacc declares a global variable yylval typically as a C union. Because Tcl does not have unions, or for that matter variable types (for the most part), the %union and %type keywords are unnecessary. This example assumes that the token generator places into ::yylval attributes of the correct type (e.g., ID will not have a non-numeric attribute). Sorry, you'll have to do your own error checking.
  • This example does not make use of [pkg_mkIndex] so it is necessary to source in the token generator. In this case the generator was developed used fickle.

New with taccle version 0.4 are:

  • operator precedence (%left, %right, %nonassoc, and %prec)
  • new command line flags -w and --version
  • fixed error when calculating first and follow sets for certain recursive rules

With taccle version 0.3 are:

  • corrected epsilon transitions; they should work for 99.99% of all cases now
  • embedded (i.e., "mid-rule") actions

taccle version 0.2 has:

  • preliminary epsilon transitions (doesn't work yet for all conditions)
  • error recovery with the error token
  • rename all variables with -p option

There are some things taccle cannot handle. These are on my TODO list:

  • detect infinitely recursive rules (e.g., foo -> foo '42' ;)
  • inherited attributes (synthesized attributes are easy, inherited not so)

Downloads

taccle is protected by the GNU General Public License. You should read the README file before use; a complete set of examples are in the examples subdirectory. Familiarity with the Dragon book as well as lex & yacc would also prove useful.

Download taccle from below:


Comments below:


FPX: Nice. At the time, I wanted to write a supplementary program for yeti to parse a yacc-like input file and produce a parser from that. I never got around to it. I wanted to write the program twice: Once, based on yeti primitives, and second, using itself -- the litmus test for every compiler is to compile itself ;)

27sep04 jcw - The following small change to taccle.tcl produces output files which are easier to examine, by breaking up potentially huge [array get ...] lines:

  ######################################################################
  # handles actually writing parser to output files

  proc write_array {fd name values} {
      puts $fd "array set $name {"
      foreach {x y} $values {
        puts $fd "  [list $x $y]"
      }
      puts $fd "}"
  }

  proc write_dest_file {} {
      puts $::dest "
  ######################################################################
  # autogenerated taccle code below
  "
      write_array $::dest ::${::p}table [array get ::lalr1_parse]
      write_array $::dest ::${::p}rules [array get ::rule_table *l]
      write_array $::dest ::${::p}rules [array get ::rule_table *dc]
      write_array $::dest ::${::p}rules [array get ::rule_table *e]

      puts $::dest "

jt Thanks for the suggestion. I've incorporated your code into version 0.4.


Return to Jason Tang


Category Application

Category Dev. Tools