PEG Example

DKF: The PEG system (see also Parser Tools) is sophisticated, but difficult to use because of the lack of concrete examples. So I've written one!

Setting things up

package require Tcl 8.6
package require pt::pgen

Note that this code requires Tcl 8.6 mainly because it uses try; porting to 8.5 is entirely possible, but is more work.

ak Note that tcllib contains a package 'try' which is a backport of Tcl 8.6's try to 8.5.

DKF: And if you do that, you'll also want to use the TclOO package too.

Defining the parser

This grammar is a cut-down version of what expr supports.

set grammar {
PEG Calculator (Expression)
    Expression  <- Term (' '* AddOp ' '* Term)*                 ;
    Term        <- Factor (' '* MulOp ' '* Factor)*             ;
    Fragment    <- '(' ' '* Expression ' '*  ')' / Number / Var ;
    Factor      <- Fragment (' '* PowOp ' '* Fragment)*         ;
    Number      <- Sign? Digit+                                 ;
    Var         <- '$' ( 'x'/'y'/'z' )                          ;

    Digit       <- '0'/'1'/'2'/'3'/'4'/'5'/'6'/'7'/'8'/'9'      ;
    Sign        <- '-' / '+'                                    ;
    MulOp       <- '*' / '/'                                    ;
    AddOp       <- '+' / '-'                                    ;
    PowOp       <- '**'                                         ;
END;
}

# Instantiate the parser class
catch [pt::pgen peg $grammar snit -class Calculator -name Grammar]

Note that you currently need to use the Snit-based compiler; the TclOO-based one has some minor-but-irritating bugs, especially when you don't save the generated parser to a file as an intermediate step (something that the pgen system effectively recommends). APN Not sure when the prior comment was made and what irritating bugs are referred to, but as of 2016, I've been using the TclOO based PEG parser in XTAL with no issues.


tomas 2014-01-04 Thanks for the nice example. Alas -- on my distro (Debian GNU/Linux) this dies a painful death, since pt::pgen needs TclOO 0.6.1, but we have 1.0 (yes, we *have* a newer version!).

This is unfortunate, and reminds me of what I call the Java Disease: a too closely-knit net of interdependencies. The next escalation step is Continuous Integration with Jenkins and all that. I'll bring up the dependency problem with my distro, but I guess the problem is more fundamental.

OK -- tracked that back to a wrong tcllib version (had tcllib 1.14, need 1.15)


Defining the compiler

A raw AST is really quite hard to work with. Let's define some code to compile it into something executable, a Tcl script that will evaluate the expression.

oo::class create CompileAST {
    variable sourcecode opns
    constructor {semantics} {
        set opns $semantics
    }
    method compile {script} {
        # Instantiate the parser
        set c [Calculator]
        set sourcecode $script
        try {
            return [my {*}[$c parset $script]]
        } finally {
            $c destroy
        }
    }

    method Expression-Empty args {}
    method Expression-Compound {from to args} {
        foreach {o p} [list Expression-Empty {*}$args] {
            set o [my {*}$o]; set p [my {*}$p]
            set v [expr {$o ne "" ? "$o \[$v\] \[$p\]" : $p}]
        }
        return $v
    }
    forward Expression   my Expression-Compound
    forward Term         my Expression-Compound
    forward Factor       my Expression-Compound
    forward Fragment     my Expression-Compound

    method Expression-Operator {from to args} {
        list $opns [string range $sourcecode $from $to]
    }
    forward AddOp        my Expression-Operator
    forward MulOp        my Expression-Operator
    forward PowOp        my Expression-Operator

    method Number {from to args} {
        list $opns value [string range $sourcecode $from $to]
    }

    method Var {from to args} {
        list $opns variable [string range $sourcecode [expr {$from+1}] $to]
    }
}

Binding to an evaluation strategy

The compiled code is no use without having something to actually provide the semantics; the compiler itself only generates the nested call structure and tidies up the terminals a bit, it doesn't understand what it is generating. So here's an evaluation engine that uses modular arithmetic and which binds variables to global vars.

oo::class create ModEval {
    variable mod
    constructor {modulo} {set mod $modulo}
    method value {literal} {return [expr {$literal}]}
    method variable {name} {return [expr {[set ::$name]}]}
    method + {a b} {return [expr {($a + $b) % $mod}]}
    method - {a b} {return [expr {($a - $b) % $mod}]}
    method * {a b} {return [expr {($a * $b) % $mod}]}
    method / {a b} {return [expr {($a / $b) % $mod}]}
    method ** {a b} {
        # Tcl supports bignums natively, so we use the naive version
        return [expr {($a ** $b) % $mod}]
    }
    export + - * / **
}

Pulling the pieces together

All we now need to do is to set the particular evaluation strategy (mod-13 math in this case), compile a particular expression, and evaluate it.

set comp [CompileAST new [ModEval create mod13 13]]
set compiled [$comp compile {$x**100 + $x + 1}]
set x 10
puts "[eval $compiled] = $compiled"

Which produces this output:

1 = ::mod13 + [::mod13 + [::mod13 ** [::mod13 variable x] [::mod13 value 100]] [::mod13 variable x]] [::mod13 value 1]