taccle

taccle -- Taccle is Another Compiler Compiler

taccle complements fickle: it reads a taccle specification file and generates a working LALR(1) parser. In other words, it is to Tcl what yacc (or bison) is to C/C++. taccle differs from yeti in that the grammar is written beforehand as a plain text file rather than generated by procedure calls. taccle is furthermore superior to yeti in that it generates pure Tcl rather than incr Tcl and supports both embedded actions and operator precedence. Unlike tyacc [L1 ], taccle is written in pure Tcl 8.4.

taccle spec files are structured nearly identically to those used by yacc. The following example (blatantly stolen from chapter 8 of lex & yacc [L2 ]) is interpreted the same way by both tools:

%token A R
 
%%
 
start: x | y R;
  
x: A R;
y: A;

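Incidentally, both yacc and taccle would report the shift/reduce conflict in the grammar above: after reading A with R as the lookahead, the parser can either shift R (to complete x) or reduce A to y.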

A Practical Example

Here is another example. The file has been compacted to make it better fit on the web page.

%{
#!/usr/bin/tclsh
%}
%token ID
%start start 
%%

start: E        { puts "Result is $1" } ;

E: E '+' T      { set _ [expr {$1 + $3}] }
   | E '-' T    { set _ [expr {$1 - $3}] }
   | T ;

T: T '*' F      { set _ [expr {$1 * $3}] }
   | T '/' F    { set _ [expr {$1 / $3}] }
   | F ;

F: '(' E ')'    { set _ $2 }
   | ID         { set _ $::yylval } ;

%%
source simple_scanner.tcl; yyparse

This is, of course, the infamous calculator example. Users of yacc/bison should note that taccle uses $_ instead of $$. Further differences are described in the README file[L3 ].
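
The grammar above sources simple_scanner.tcl, which is not shown on this page. As a rough sketch of the kind of work such a scanner does, the hand-written tokenizer below splits an arithmetic string into token/value pairs: numbers become ID tokens and every other non-space character is passed through as itself, much like the catch-all rule in the fickle scanner further down this page. The proc name tokenize and the pair format are hypothetical and are not part of taccle or fickle; a real scanner would instead provide yylex and set ::yylval.

```tcl
# Hypothetical helper, not part of taccle or fickle: split an arithmetic
# expression into a list of {token value} pairs.
proc tokenize {text} {
    set tokens {}
    set pos 0
    set len [string length $text]
    while {$pos < $len} {
        if {[regexp -start $pos -indices {\A\s+} $text m]} {
            # Skip whitespace between tokens.
            set pos [expr {[lindex $m 1] + 1}]
        } elseif {[regexp -start $pos -indices {\A\d+(?:\.\d+)?} $text m]} {
            # An integer or decimal number becomes an ID token.
            set value [string range $text [lindex $m 0] [lindex $m 1]]
            lappend tokens [list ID $value]
            set pos [expr {[lindex $m 1] + 1}]
        } else {
            # Any other character stands for itself, e.g. + or (.
            set ch [string index $text $pos]
            lappend tokens [list $ch $ch]
            incr pos
        }
    }
    return $tokens
}

puts [tokenize "2 + 3 * (4 - 1)"]
```

Each pair holds what a yylex implementation would return and what it would store in ::yylval for that token.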

There are some things taccle cannot handle. These are on my TODO list:

  • inherited attributes (synthesized attributes are easy, inherited not so)

Downloads

taccle is protected by the GNU General Public License. You should read the README file before use; a complete set of examples is in the examples subdirectory. Familiarity with the Dragon book as well as lex & yacc would also prove useful.


Bezoar 2015 Dec13

Links to the tar.gz files are dead. However, Jason has put his taccle and fickle projects on GitHub: taccle[L4 ]

Download taccle from below:

ccbbaa 20200928: version 1.1 seems to have introduced a small bug. The first line of subroutine code after the second %% in the .tac file is not copied to the output. Fix: leave a blank line in the .tac file after the last %%, then add the subroutines.

Version 1.1 added infinite recursion detection. taccle will flag the following rules as invalid:

foo: foo 'x' ;
bar: 'y' baz ;
baz: 'z' bar ; 
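
One way to see why these rules are rejected is that none of the three nonterminals can ever derive a string made only of terminals: foo always expands to another foo, and bar and baz only expand to each other. The sketch below finds such nonterminals by fixed-point iteration. It only illustrates the idea and is not taccle's actual implementation; the grammar representation (a flat list of nonterminal/alternatives pairs) and the proc name unproductive are assumptions made for this example.

```tcl
# grammar: flat list {nonterminal alternatives ...}. Each alternatives value
# is a list of right-hand sides, and each right-hand side is a list of symbols.
# Any symbol that never appears as a nonterminal is treated as a terminal.
proc unproductive {grammar} {
    foreach {nt alts} $grammar {
        set rules($nt) $alts
    }
    # Repeat until no new nonterminal is found that can derive terminals only.
    set changed 1
    while {$changed} {
        set changed 0
        foreach nt [array names rules] {
            if {[info exists ok($nt)]} continue
            foreach rhs $rules($nt) {
                set finishes 1
                foreach sym $rhs {
                    if {[info exists rules($sym)] && ![info exists ok($sym)]} {
                        set finishes 0
                        break
                    }
                }
                if {$finishes} {
                    set ok($nt) 1
                    set changed 1
                    break
                }
            }
        }
    }
    # Whatever was never marked can only recurse forever.
    set bad {}
    foreach nt [array names rules] {
        if {![info exists ok($nt)]} { lappend bad $nt }
    }
    return [lsort $bad]
}

set grammar {
    foo {{foo 'x'}}
    bar {{'y' baz}}
    baz {{'z' bar}}
    E   {{E '+' T} {T}}
    T   {{ID}}
}
puts [unproductive $grammar]
```

For this sample grammar the proc reports foo, bar and baz, while E is accepted because its second alternative gives the left recursion a way to bottom out.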

taccle version 1.0 is the first official release:

New in taccle version 0.4:

  • operator precedence (%left, %right, %nonassoc, and %prec)
  • new command line flags -w and --version
  • fixed error when calculating first and follow sets for certain recursive rules

New in taccle version 0.3:

  • corrected epsilon transitions; they should now work in 99.99% of cases
  • embedded (i.e., "mid-rule") actions

New in taccle version 0.2:

  • preliminary epsilon transitions (they do not yet work under all conditions)
  • error recovery with the error token
  • rename all variables with -p option

Another Example

ccbbaa 20200928 adding small fixes (filenames, blank line after 2nd %% in .tac source to make code work again - found broken - improved makefile too).

I was asked to provide a non-calculator example. This one parses some simple English sentences. (Handling all of English is difficult to impossible; that requires a doctorate in natural language parsing. See NLP for a Knowledge Database for an attempt to do so.) First is the grammar; save this as english_parser.tac:

%{
source english_scanner.tcl
%}
%token NOUN VERB ADJECTIVE ARTICLE PREPOSITION
%%

sentence:       subject VERB direct_object '.'
                 { puts "subject: $1\nverb: $2\ndirect object: $3" }
         |       subject VERB '.'
                 { puts "subject: $1\nverb: $2" }
         ;

subject:        noun_phrase               { set _ $1 }
         ;

noun_phrase:    ARTICLE ADJECTIVE NOUN    { set _ [list $1 $2 $3] }
         |       ARTICLE NOUN              { set _ [list $1 $2] }
         |       ADJECTIVE NOUN            { set _ [list $1 $2] }
         |       NOUN                      { set _ [list $1] }
         ;

direct_object:  PREPOSITION noun_phrase   { set _ "$1 $2" }
         |       noun_phrase               { set _ $1 }
         ;

%%

yyparse

Next is the scanner, english_scanner.fcl:

%{
source "english_parser.tab.tcl"
%}

%option interactive caseless

%%

bob|john|sue|jane  { set ::yylval $yytext; return $::NOUN }
ball|box|brick|bag { set ::yylval $yytext; return $::NOUN }
red|blue|green|big { set ::yylval $yytext; return $::ADJECTIVE }
a|an|the           { set ::yylval $yytext; return $::ARTICLE }
kick|hit|took|gave { set ::yylval $yytext; return $::VERB }
jumped|leapt|flew  { set ::yylval $yytext; return $::VERB }
over|under|around  { set ::yylval $yytext; return $::PREPOSITION }
\s                 # ignore whitespace
.                  { set ::yylval $yytext; return $yytext }

Here is a Makefile; you'll need to modify the paths to match your system:

ccbbaa 20200928 don't forget to change leading spaces in rule code to TABs when pasting. Modified filenames to work. english_grammar.* -> english_parser.* etc.

TCL=tclsh
FICKLE=~/fickle/fickle.tcl
TACCLE=~/taccle/taccle.tcl

ALL_TCL= english_parser.tcl english_scanner.tcl

all: $(ALL_TCL)
         make test

clean:
         rm $(ALL_TCL)

test: $(ALL_TCL)
         echo "Jane jumped." | $(TCL) ./english_parser.tcl

%.tcl: %.fcl
         $(TCL) $(FICKLE) $<

%.tcl: %.tac
         $(TCL) $(TACCLE) -d -w $<

And some example runs:

 $ echo "Big Bob jumped over the red ball." | tclsh english_parser.tcl
 subject: Big Bob
 verb: jumped
 direct object: over the red ball
 $ echo "Sue took the green box." | tclsh english_parser.tcl
 subject: Sue
 verb: took
 direct object: the green box
 $ echo "The brick flew under a blue bag." | tclsh english_parser.tcl
 subject: The brick
 verb: flew
 direct object: under a blue bag
 $ echo "Jane jumped." | tclsh english_parser.tcl
 subject: Jane
 verb: jumped

Comments below:


FPX: Nice. At the time, I wanted to write a supplementary program for yeti to parse a yacc-like input file and produce a parser from that. I never got around to it. I wanted to write the program twice: Once, based on yeti primitives, and second, using itself -- the litmus test for every compiler is to compile itself ;)

27sep04 jcw - The following small change to taccle.tcl produces output files which are easier to examine, by breaking up potentially huge [array get ...] lines:

  ######################################################################
  # handles actually writing parser to output files

  proc write_array {fd name values} {
      puts $fd "array set $name {"
      foreach {x y} $values {
        puts $fd "  [list $x $y]"
      }
      puts $fd "}"
  }

  proc write_dest_file {} {
      puts $::dest "
  ######################################################################
  # autogenerated taccle code below
  "
      write_array $::dest ::${::p}table [array get ::lalr1_parse]
      write_array $::dest ::${::p}rules [array get ::rule_table *l]
      write_array $::dest ::${::p}rules [array get ::rule_table *dc]
      write_array $::dest ::${::p}rules [array get ::rule_table *e]

      puts $::dest "

jt Thanks for the suggestion. I've incorporated your code into version 0.4.
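
To see what the write_array helper produces, it can be pointed at stdout with a small list of key/value pairs. The array name demo and its contents are made up for illustration; the proc body is copied from the snippet above.

```tcl
proc write_array {fd name values} {
    puts $fd "array set $name {"
    foreach {x y} $values {
        puts $fd "  [list $x $y]"
    }
    puts $fd "}"
}

# Hypothetical sample data: the list-valued entries show why each pair is
# passed through [list], which keeps the output a valid Tcl script.
write_array stdout demo {0:ID {shift 3} 1:PLUS {reduce 2}}
```

Each key/value pair lands on its own line, so the generated file can be read or diffed without scrolling through one very long line.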


Paul Doerwald I was told that my regexps were "excellent" and "worthy of worship by Perl monks" *blush* and thus encouraged to post here. This is a fickle script that takes a fulltext of some sort (like a New York Times article) and looks through it for names of people and companies. There are lots of false positives but that's life, and in the application they didn't matter.

%{
#!/usr/bin/tclsh
%}

%%

[ \t]                   puts -nonewline "$yytext"
((([[:upper:]]')?[[:upper:]]+[&.]?[[:lower:]]*),?[ ]?(and|&)*[ ]+)+(([[:upper:]]')?[[:upper:]]+[[:lower:]]*)+           |
(((([[:upper:]]')?[[:upper:]]+\.?[[:lower:]]*)[ ]+)+(of|the|and|for|&)*[ ]*)+(([[:upper:]]')?[[:upper:]]+[[:lower:]]*)+         |
([[:upper:]]('[[:upper:]])?[[:alpha:].-]+[ ])+([[:upper:]]'?[[:alpha:].-]+)             |
[[:upper:]]('[A-Z])?[[:alpha:].-]+              |
[[:upper:]]('[[:upper:]])?[[:alpha:]-]+         |
[[:upper:]]('[[:upper:]])?[[:alpha:]]+          puts -nonewline "<$yytext>"
[[:alpha:]]+            puts -nonewline "$yytext"
.|\n                    puts -nonewline "$yytext"

%%

yylex
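
The simplest of the patterns above can be tried directly with Tcl's regsub, without generating a scanner. This sketch applies only the last capitalized-word alternative to a made-up sentence; it ignores the multi-word alternatives, so adjacent names are bracketed separately rather than as one span.

```tcl
# Last alternative from the scanner above: a capital letter, an optional
# 'X part (as in O'Brien), then one or more letters.
set pattern {[[:upper:]]('[[:upper:]])?[[:alpha:]]+}

# Made-up sample text for illustration.
set text "Yesterday Paul met O'Brien at the bank."

# & in the substitution stands for the whole match.
regsub -all $pattern $text {<&>} marked
puts $marked
```

As in the original script, false positives are expected: the word at the start of a sentence is capitalized and therefore gets bracketed too.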

Return to Jason Tang