Version 4 of Expression parsing

Updated 2008-07-18 09:04:02 by lars_h

Also see parsing expressions


Richard Suchenwirth 2002-06-09 - Arithmetics in Tcl is made easy with the expr command, which even, in contrast to Tcl's Polish (operator first, then operands) style, accepts the popular (from C etc.) infix notation:

 ($a + $b) * $c

Now let's assume we didn't have expr, but wanted to build up some arithmetic routines, from only incr with +1 or -1. This is of course limited to integers, and to make things for me easier, let's further assume we're only dealing with "natural numbers" 0, 1, 2, 3, 4... Rebuilding addition and multiplication with two operands isn't hard (though of course slower than expr):

 proc + {a b} {
    if {$a==0} {
        set b ;# Peano axiom #2
    } else {
        + [incr a -1] [incr b]
    }
 } ;# of course, proc + {a b} {incr a $b} is faster...

 proc * {a b} {
    if {$a==0} {
        return 0 ;# Peano axiom #4
    } else {
        + $b [* [incr a -1] $b]
    }
 }

if 0 {This works for the simple cases (as said, exactly two natural numbers as operands), but for expressions like the one above we'd need a parser that produces Polish notation according to the two-operand restriction. Here's my Sundax evening version of a highly recursive expression parser, which takes care of arbitrarily nested parentheses (by rewriting them to bracketed calls to itself ;-), and the precedence of multiplication over addition, and works its way through constructs like

 a + ( b + c ) * d * e * f

which it turns to the corresponding "parse tree", finally even substituting braces into brackets, so the resulting string

 + a [* [* [* [[+ b c]] d] e] f]]

is almost ready for evalling, to call our weak + and * above... As evident from the letters used, this parsing was merely symbolic processing. In their place, natural numbers might have stood as well. :}

 proc s {args} {
    # parse "(a + b) * c + d" to "+ [* [+ a b] c] d"
    if {[llength $args]==1} {set args [lindex $args 0]}
        if [regexp {[()]} $args] {
                eval s [string map {( "\[s " ) \]} $args]
        } elseif {[lsearch -exact $args *]>=0} {
        s [s'group args *]
        } elseif {[lsearch -exact $args +]>=0} {
            s [s'group args +]
        } else {
                string map {\{ \[ \} \]} [join $args]
        }
 }
 proc s'group {listName op} {
    # turn ".. a op b .." to ".. {op a b} .."
    upvar 1 $listName list
    set pos [lsearch -exact $list $op]
    set p_1 [expr {$pos-1}]; set p1 [expr {$pos+1}]
    set list [lreplace $list $p_1 $p1 \
            [list [lindex $list $pos] \
                  [lindex $list $p_1] \
                  [lindex $list $p1]]]
 }