RPN in Tcl

Richard Suchenwirth 2001-06-01 - There are exactly three ways of horizontally placing a binary operator (e.g. +) relative to its operands (call them A and B): before, between, or behind.

The "between" style (infix notation, A + B) we're most familiar with, from arithmetics in daily life, C, and Tcl's expr command. Here you need operator precedence rules and/or parens to indicate the intended execution sequence.

  (1 + 2) * (3 + 4)    -- C, tcl/expr, ...

The "before" style (prefix or "Polish" notation, + A B) is unambiguous, but execution sequence goes not necessarily from left to right. Lisp and Tcl (mostly) use this pattern, but still need parens or brackets.

 (* (+ 1 2) (+ 3 4))    -- LISP

The "behind" style (postfix or "Reverse Polish (RPN)" notation, A B +) is used in Forth (see Trying FORTH in Tcl), PostScript and many pocket calculators. No parens are needed at all, so both languages reuse them for different purposes: Forth for comments, PostScript as string delimiters. When you meet an operator in an RPN language, all its operands are already set up, so it's easiest implemented. That's why I picked it for this weekend's fun project... an interpreter for a tiny subset of Forth - but you could adjust this with few changes for Postscript, maybe except for all the graphics rendering engine ;-)

 1 2 + 3 4 + *      -- Forth

RPN languages are heavily stack-centered, so I create a global list Stack where a push appends (possibly more than one element) to the end, as is the habit in Forth display, while a pop returns the last element, assigns it to a var if specified, and truncates the list by 1:

 proc push args {eval lappend ::Stack $args}
 proc pop {args} {
    global Stack
    if ![llength $Stack] {error "stack underflow"}
    set res [lindex $Stack end]
    set Stack [lrange $Stack 0 [expr [llength $Stack]-2]]
    if [llength $args]==1 {upvar 1 $args there; set there $res}
    set res
 }

Initially, the stack and an array RPN for word definitions are set up (in Forth, word defs are also in a stack, but this was weekend after all), and operands from expr and some Forth basics are defined in Tcl - note that I use small integers as variable names in order not to conflict with other user code. Finally, some word definitions are done in Forth in their peculiar

 : name words that make up the body ;

syntax - in fact Forth "commands" could be called by just substituting their body for the name, but since RPN holds Tcl definitions, I call rpn with the specified word sequence.

 proc rpn:init {} {
    global RPN Stack
    set RPN(_mode) normal ;# may also be comment or code
    set Stack [list]
    ######### importing binary operators from expr...
    foreach op {+ - * / % == != > >= < <= && ||} {
        regsub @op {pop 1; pop 2; push [expr $2 @op $1]} $op RPN($op)
    }
    ######### defining some elementaries in Tcl...
    foreach {op code} {
        .    {puts -nonewline "[pop] "; flush stdout}
        .r   {puts -nonewline [format "%[pop]d " [pop]]}
        .s   {puts $Stack}
        abs  {push [expr abs([pop])]}
        cr   {puts ""}
        drop {pop}
        emit {puts -nonewline [format %c [pop]]}
        pick {push [lindex $Stack [expr -[pop]+[llength $Stack]-1]]}
        roll {
            set 1 [expr [llength $Stack]-[pop]-2]
            set 2 [lindex $Stack $1]
            set Stack [lreplace $Stack $1 $1] 
            push $2
        }
        :    {set RPN(_mode) code}
        "("  {set RPN(_mode) comment}
        +debug {set RPN(_mode) debug}
        -debug {set RPN(_mode) normal}
    } {set RPN($op) $code}
    ######## defining more elementaries in sub-Forth itself...
    rpn {
        : dup  0 pick ; 
        : mod  % ;
        : over 1 pick ;
        : rot  2 roll ;
        : swap 1 roll ;
    }
 }

And this is the main RPN "engine" that executes this kind of code. You can pass it a list of words or, convenient for interactive use, just omit the braces and let args pick its arguments up (except in word definitions, where it's better to hide the final semicolon from Tcl's parser):

 rpn {1 2 + 3 4 + *}  --> 21
 rpn 1 2 + 3 4 + *    --> 21

Each call to rpn returns the top of stack without modifying that, so for calculation tasks, you're almost set. You can see the whole stack from Tcl with set ::Stack, or from sub-Forth with .s. For testing, I added the words +debug and -debug that turn on/off a display of (word) stack for each word processed. Without arguments, you can call rpn like an interactive Forth system, and leave it with the word bye.

 proc rpn args {
    global RPN Stack
    if ![info exists Stack] rpn:init
     if {$args==""} {
         while {[lindex $Stack end]!="bye"} {
             puts -nonewline "ok "; flush stdout
             gets stdin line
             if [catch {rpn $line} res] {puts "ERROR: $res"}
         }
         puts [pop]
         return
     }
    if [llength $args]==1 {set args [lindex $args 0]}
    foreach word $args {
        switch -- $RPN(_mode) {
            normal - debug {
                if [info exists RPN($word)] {
                    eval $RPN($word)
                } else {
                    push $word
                }
                if {$RPN(_mode)=="debug"} {puts "($word) $Stack"}
            }
            comment {if {$word==")"} {set RPN(_mode) normal}}
            code    {
                if {$word==";"} {
                    set RPN(_mode) normal
                    set c $RPN(_code)
                    set RPN([lindex $c 0]) [concat rpn [lrange $c 1 end]]
                    set RPN(_code) {}
                } else {
                    lappend RPN(_code) $word}
                }
            }
    }
    lindex $Stack end
 }

A real Forth system stacks shorts/longs or other fixed-size stuff, but since a Tcl list can hold any string as elements, I'm not hurried to model the more complex string handling in Forth. Instead, all words not known as commands are pushed, and it's up to the popper what he does with them, so Forth's

 ." hello, world "

(dot-quote and quote being words that cause special handling of their content) is now simply

 "hello, world" .

- the dot, in Forth only for printing integers, here prints any string it finds on top of stack...


What's completely missing yet are control structures like

   ... (cond) IF (this) ELSE (that) THEN ...

To add those, more lookahead modes in the interpreter are needed. Maybe sometime later...


FORTH handles these with a separate stack I understand. Is there a way of doing that in Tcl? -BMA - RS: Of course - as a sensible implementation of a stack is a list, nobody prevents us from using more than one list :-)

Lars H: Postscript handles this with only one stack (even though there are on the whole several stacks in the language). There if is an operator which takes the boolean value and the then-branch of the condition as argument, like so

 2 2 + 4 eq {(Equality!) show} if

Similarly, ifelse takes a boolean, a then-branch, and an else-branch. All of it works like the If we had no if implementations of Tcl's if. I admit that it looks odd, though.

RS: Thanks - after a long time I've followed your hint, and rewrote this to RPN again.