[Richard Suchenwirth] 2013-11-28 - Since Tcl 8.0, [proc] bodies are byte-compiled. Since 8.5, we can use the tcl::unsupported::disassemble command to see how it is done in the Tcl Assembly language, [TAL]. Since 8.6, there is also a sort-of converse function tcl::unsupported::assemble to turn a symbolic [assembler] notation into bytecode. However, there are quite many inequalities between the language that is produced by '''disassemble''' (call it "dis"), and the language accepted by '''assemble''' (call it "asm"). "The TAL that can't be assembled is not the real TAL", a Daoist might say.. Some instruction names can be translated (e.g. "incrScalar1Imm" in "dis" corresponds to "incrImm" in "asm", see the ''normalize'' function below), for others I haven't found a way yet. "done" (something like [return]) in "dis" can be ignored when at the end of the code, but not in the middle of it. [foreach] winds up as two instructions in "dis", but I haven't found matches for them in "asm". Though ideally, one would expect to be able to feed the disassemble output into assemble to get the same results... The following code tries to bridge that gap. It is far from feature-complete and will need much more work. But it at least works in several simple cases (see the examples at bottom), and will be extended to cover more. In any case, the "proof of the pudding" is: * write a proc, e.g. "f", in Tcl * disassemble its body * use [dis2asm] to convert the disassembly to TAL (input) language * test it by calling it (only then the assembly takes place), see if it works like before * possibly optimize the TAL code by hand or script (see [dis2asm gets better] for optimization potentials) - and test again Some shortcuts to start with: ====== #interp alias {} asm {} ::tcl::unsupported::assemble ;# worksn't - the [assemble] command isn't exported yet namespace eval tcl::unsupported {namespace export assemble} namespace import tcl::unsupported::assemble rename assemble asm interp alias {} disasm {} ::tcl::unsupported::disassemble ====== I have extended the ''aproc'' function, which before just returned the disassembly, to accept an extra -x flag to "reassemble" the disassembly, and [eval] it as a proc, so you can test it just by calling it. The original disassembly code is also shown in a comment. ====== proc aproc {name argl body args} { proc $name $argl $body set res [disasm proc $name] if {"-x" in $args} { set res [list proc $name $argl [list asm [dis2asm $res]]] eval $res } return $res } ====== This now is the converter from "dis" to TAL: ====== proc dis2asm body { set res "" set jumptargets {} foreach line [split $body \n] { if [regexp {\# pc (\d+)} $line -> pc] {lappend jumptargets $pc} } foreach line [split $body \n] { set line [string trim $line] if {$line eq ""} continue set code "" if {[regexp {\((\d+)\) (.+)} $line -> pc instr]} { if {$pc in $jumptargets} { append res "\n label L$pc;" } if {[regexp {(.+)#(.+)} $instr -> instr comment]} { set arg [lindex $comment end] if {$arg eq ""} {set arg "{}"} if [string match jump* $instr] {set arg L$arg} } else {set arg ""} set instr0 [normalize [lindex $instr 0]] if {$instr0 in {invokeStk}} {set arg [lindex $instr end]} if {$instr0 in {incrImm}} {set arg [list $arg [lindex $instr end]]} set code [format " %-24s" "$instr0 $arg"] if {$instr0 in {startCommand}} {set code ""} append res "\n $code ;# [string trim $line]" } } append res \n return $res } ====== This translates "dis" instruction names to "asm" instruction names, where different: ====== proc normalize instr { regsub {\d+$} $instr "" instr ;# strip off trailing length indicator set instr [string map { loadScalar load nop "" storeScalar store incrScalar1Imm incrImm } $instr] return $instr } ====== Now to try it out. ====== % aproc f x {expr {sqrt($x)+1}} -x proc f x {asm { push tcl::mathfunc::sqrt ;# (0) push1 0 # "tcl::mathfunc::sqrt" load x ;# (2) loadScalar1 %v0 # var "x" invokeStk 2 ;# (4) invokeStk1 2 push 1 ;# (6) push1 1 # "1" add ;# (8) add ;# (9) done}} % f 2 2.414213562373095 ====== The following test shows that while mostly we have to remove parts of the disassembly, for jump targets we need to insert '''label''' pseudo-instructions: ====== % aproc f x {if {$x==1} {set x 2};return $x} -x proc f x {asm { load x ;# (0) loadScalar1 %v0 # var "x" push 1 ;# (2) push1 0 # "1" eq ;# (4) eq jumpFalse L13 ;# (5) jumpFalse1 +8 # pc 13 push 2 ;# (7) push1 1 # "2" store x ;# (9) storeScalar1 %v0 # var "x" jump L15 ;# (11) jump1 +4 # pc 15 label L13; push {} ;# (13) push1 2 # "" label L15; pop ;# (15) pop load x ;# (16) loadScalar1 %v0 # var "x" ;# (18) done}} % f 3 3 % f 1 2 ====== Of course, tcl::mathfunc::[hypot] is C-coded and much faster than this... ====== % aproc hypot {x y} {expr {sqrt($x**2+$y**2)}} -x proc hypot {x y} {asm { push tcl::mathfunc::sqrt ;# (0) push1 0 # "tcl::mathfunc::sqrt" load x ;# (2) loadScalar1 %v0 # var "x" push 2 ;# (4) push1 1 # "2" expon ;# (6) expon load y ;# (7) loadScalar1 %v1 # var "y" push 2 ;# (9) push1 1 # "2" expon ;# (11) expon add ;# (12) add invokeStk 2 ;# (13) invokeStk1 2 tryCvtToNumeric ;# (15) tryCvtToNumeric ;# (16) done}} % hypot 3 4 5.0 ====== ====== % aproc f x {incr x -1} -x proc f x {asm { incrImm x -1 ;# (0) incrScalar1Imm %v0 -1 # var "x" ;# (3) done}} % f 5 4 ====== [while] loops can already be handled: ====== % aproc f x {while {$i <= $x} {puts $i; incr i}} -x proc f x {asm { jump L22 ;# (0) jump1 +22 # pc 22 label L2; push puts ;# (2) push1 0 # "puts" load i ;# (4) loadScalar1 %v1 # var "i" invokeStk 2 ;# (6) invokeStk1 2 pop ;# (8) pop ;# (9) startCommand +12 1 # next cmd at pc 21 incrImm i +1 ;# (18) incrScalar1Imm %v1 +1 # var "i" pop ;# (21) pop label L22; load i ;# (22) loadScalar1 %v1 # var "i" load x ;# (24) loadScalar1 %v0 # var "x" le ;# (26) le jumpTrue L2 ;# (27) jumpTrue1 -25 # pc 2 push {} ;# (29) push1 1 # "" ;# (31) done }} ====== [for] loops can also well be handled: ====== % aproc f x {for {set i 0} {$i<$x} {incr i} {puts $i}} -x proc f {x {i {}}} {asm { push 0 ;# (0) push1 0 # "0" store i ;# (2) storeScalar1 %v1 # var "i" pop ;# (4) pop jump L27 ;# (5) jump1 +22 # pc 27 label L7; push puts ;# (7) push1 1 # "puts" load i ;# (9) loadScalar1 %v1 # var "i" invokeStk 2 ;# (11) invokeStk1 2 pop ;# (13) pop ;# (14) startCommand +12 1 # next cmd at pc 26 incrImm i +1 ;# (23) incrScalar1Imm %v1 +1 # var "i" pop ;# (26) pop label L27; load i ;# (27) loadScalar1 %v1 # var "i" load x ;# (29) loadScalar1 %v0 # var "x" lt ;# (31) lt jumpTrue L7 ;# (32) jumpTrue1 -25 # pc 7 push {} ;# (34) push1 2 # "" ;# (36) done }} % f 4 0 1 2 3 % ====== ---- '''What didn't work yet''' The following illustrates the issue of mid-code "done" (see (9)).. we can't convert it to input TAL. Maybe a "jump" to a label at the end might help? More specifically, when a "done" is met in non-final position, it shall be converted to "jump done", and a "label done" added at the end. ====== % aproc f x {if {$x > 0} {return 1} else {return 0}} -x proc f x {asm { load x ;# (0) loadScalar1 %v0 # var "x" push 0 ;# (2) push1 0 # "0" gt ;# (4) gt jumpFalse L12 ;# (5) jumpFalse1 +7 # pc 12 push 1 ;# (7) push1 1 # "1" ;# (9) done ;# (10) nop ;# (11) nop label L12; push 0 ;# (12) push1 0 # "0" ;# (14) done ;# (15) done}} % f 1 inconsistent stack depths on two execution paths ====== The above error came from a Tclkit 8.6.1 on a Lubuntu netbook. The same example on tclsh 8.6b2 on Win XP works there (with the solution I had in mind - jump to final "done" which here is called "L33" ;^) ====== % aproc f x {if {$x > 0} {return 1} else {return 0}} -x proc f x {asm { load x ;# (0) loadScalar1 %v0 # var "x" push 0 ;# (2) push1 0 # "0" gt ;# (4) gt jumpFalse L21 ;# (5) jumpFalse1 +16 # pc 21 ;# (7) startCommand +12 1 # next cmd at pc 19 push 1 ;# (16) push1 1 # "1" ;# (18) done jump L33 ;# (19) jump1 +14 # pc 33 label L21; ;# (21) startCommand +12 1 # next cmd at pc 33 push 0 ;# (30) push1 0 # "0" ;# (32) done label L33; ;# (33) done }} 37 % f 3 1 ====== This was fixed in [dis2asm gets things done]. Another issue: the "dis" language generated for [foreach] looks so different that at the moment I was only puzzled. The language accepted by "asm" has no instruction beginning with "foreach"... The story continues at [dis2asm does macros] - [dis2asm gets things done] - [dis2asm gets better] - [dis2asm learns to catch] <>Category Example