dis2asm gets things done

Richard Suchenwirth 2013-11-30 - Another chapter in the dis2asm saga: a converter from the language produced by the tcl::unsupported::disassemble command ("dis" for short) to the language expected by the tcl::unsupported::assemble command, also known as TAL (Tcl Assembly Language). Both are assembler notations for Tcl's bytecode engine, but they differ in a number of ways, which dis2asm tries to bridge.

One remaining issue on the dis2asm page concerned the "done" instruction. Its semantics are like return, but it is not available in TAL, so until now I just skipped it. That produced bad code whenever a "done" was not at the end of the bytecode. Example (the "done" in question is at position (9)):

% aproc f x {if {$x > 0} {return 1} else {return 0}} -x
proc f x {asm {
   load x                   ;# (0) loadScalar1 %v0         # var "x"
   push 0                   ;# (2) push1 0         # "0"
   gt                       ;# (4) gt
   jumpFalse L12            ;# (5) jumpFalse1 +7         # pc 12
   push 1                   ;# (7) push1 1         # "1"
                            ;# (9) done
                            ;# (10) nop
                            ;# (11) nop
 label L12;
   push 0                   ;# (12) push1 0         # "0"
                            ;# (14) done
                            ;# (15) done
}}
% f 2
inconsistent stack depths on two execution paths
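
Why the assembler complains: with "done" skipped, the branch that pushes 1 falls through into label L12 with one value on the stack, while the jumpFalse path arrives there with none. The following is a minimal sketch of that bookkeeping, not Tcl's actual verifier. labelDepths is a hypothetical helper, the stack effects in its table are assumed values for just the instructions in this example, and it only handles forward jumps.

```tcl
# Hypothetical illustration: record the stack depth with which each label
# is reached, either by a jump or by falling through from the line above.
proc labelDepths {code} {
    # assumed net stack effect of the few instructions used here
    array set effect {load 1 push 1 gt -1 jumpFalse -1 jump 0}
    set depth 0                 ;# "" means: unreachable by fall-through
    set seen [dict create]
    foreach instr $code {
        lassign $instr op arg
        if {$op eq "label"} {
            if {$depth ne ""} {dict lappend seen $arg $depth}
            # carry on with the first depth recorded for this label
            set depth [lindex [dict get $seen $arg] 0]
            continue
        }
        if {$depth eq ""} continue  ;# dead code after an unconditional jump
        incr depth $effect($op)
        if {[string match jump* $op]} {dict lappend seen $arg $depth}
        if {$op eq "jump"} {set depth ""}
    }
    return $seen
}

set bad  {{load x} {push 0} gt {jumpFalse L12} {push 1} {label L12} {push 0}}
set good {{load x} {push 0} gt {jumpFalse L12} {push 1} {jump Done}
          {label L12} {push 0} {label Done}}
puts [labelDepths $bad]   ;# L12 {0 1}         (two different depths)
puts [labelDepths $good]  ;# L12 0 Done {1 1}  (every label consistent)
```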

I had already proposed a fix for this on dis2asm: convert "done" to "jump Done" when it is not at or near the end of the input, and add a "label Done" at the very end of the output. With that change the TAL output runs well and looks like this:

% aproc f x {if {$x > 0} {return 1} else {return 0}} -x
proc f x {asm {
   load x                   ;# (0) loadScalar1 %v0         # var "x"
   push 0                   ;# (2) push1 0         # "0"
   gt                       ;# (4) gt
   jumpFalse L12            ;# (5) jumpFalse1 +7         # pc 12
   push 1                   ;# (7) push1 1         # "1"
   jump Done                ;# (9) done
                            ;# (10) nop
                            ;# (11) nop
 label L12;
   push 0                   ;# (12) push1 0         # "0"
                            ;# (14) done
                            ;# (15) done
 label Done;
}}
% f 42
1
% f -42
0
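
The "done" rule on its own can be sketched as follows. rewriteDone is a hypothetical helper, not part of dis2asm; it assumes the instructions have already been reduced to plain TAL-style strings, and it uses the same kind of position test as the converter (more than two lines before the end).

```tcl
# Hypothetical helper: apply only the "done" rule to a list of instructions.
proc rewriteDone {instrs} {
    set out {}
    set n [llength $instrs]
    set i 0
    foreach instr $instrs {
        incr i
        if {$instr eq "done"} {
            # a "done" well before the end becomes a jump to the final label;
            # a "done" at or near the end is simply dropped
            if {$i < $n - 2} {lappend out "jump Done"}
        } else {
            lappend out $instr
        }
    }
    lappend out "label Done"    ;# common exit point for all converted "done"s
    return $out
}

set dis {{load x} {push 0} gt {jumpFalse L12} {push 1} done {push 0} done done}
puts [join [rewriteDone $dis] \n]
```

Only the "done" in sixth position is turned into "jump Done"; the two trailing ones fall away because execution reaches "label Done" anyway.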

So here is the latest version of dis2asm that gets things done. I had to add a line counter so that we know where in the input we are, and I also made the output format a little tidier.

proc dis2asm body {
    set fstart " push -1; store @p; pop   "
    set fstep  " incrImm @p +1;load @l;load @p
   listIndex;store @i;pop
   load @l;listLength;lt    "
    set res  ""
    set wait ""
    set jumptargets {}
    set lines [split $body \n]
    foreach line $lines { ;#-- pass 1: collect jump targets
        if {[regexp {\# pc (\d+)} $line -> pc]} {lappend jumptargets $pc}
    }
    set lineno 0
    foreach line $lines { ;#-- pass 2: do the rest
        incr lineno
        set line [string trim $line]
        if {$line eq ""} continue
        set code ""
        if {[regexp {slot (\d+), (.+)} $line -> number descr]} {
            set slot($number) $descr
        } elseif {[regexp {data=.+loop=%v(\d+)} $line -> ptr]} {
            #got ptr, carry on
        } elseif {[regexp {it%v(\d+).+\[%v(\d+)\]} $line -> copy number]} {
            set loopvar [lindex $slot($number) end]
            if {$wait ne ""} {
                set map [list @p $ptr @i $loopvar @l $copy]
                set code [string map $map $fstart]
                append res "\n  $code ;# $wait"
                set wait ""
            }
        } elseif {[regexp {^ *\((\d+)\) (.+)} $line -> pc instr]} {
            if {$pc in $jumptargets} {append res "\n label L$pc;"}
            if {[regexp {(.+)#(.+)} $instr -> instr comment]} {
                set arg [list [lindex $comment end]]
                if {[string match jump* $instr]} {set arg L$arg}
            } else {set arg ""}
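            # normalize is not defined in this listing (presumably carried over
            # from the dis2asm page); it maps dis names like push1 to TAL names
            set instr0 [normalize [lindex $instr 0]]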
            switch -- $instr0 {
                concat - invokeStk {set arg [lindex $instr end]}
                incrImm   {set arg [list $arg [lindex $instr end]]}
            }
            set code "$instr0 $arg"
            switch -- $instr0 {
                done          {
                    if {$lineno < [llength $lines]-2} {
                        set code "jump Done"
                    } else {set code ""}
                }
                startCommand  {set code ""}
                foreach_start {set wait $line; continue}
                foreach_step  {set code [string map $map $fstep]}
            }
            append res "\n   [format %-24s $code] ;# $line"
        }
    }
    append res "\n label Done;\n"
    return $res
}