[Richard Suchenwirth] 2013-11-28 - Since Tcl 8.0, [proc] bodies are byte-compiled. Since 8.5, we can use the tcl::unsupported::disassemble command to see how it is done in the Tcl Assembly language, [TAL]. Since 8.6, there is also a sort-of converse function tcl::unsupported::assemble to turn a symbolic [assembler] notation into bytecode. However, there are quite many inequalities between the language that is produced by '''disassemble''' (call it "dis"), and the language accepted by '''assemble''' (call it "asm"). Some can be translated (e.g. "incrScalar1Imm" in "dis" corresponds to "incrImm" in "asm", see the ''normalize'' function below), for others I haven't found a way yet. "done" (something like [return]) in "dis" can be ignored when at the end of the code, but not in the middle of it. [foreach] winds up as two instructions in "dis", but I haven't found matches for them in "asm". Though ideally, one would expect to be able to feed the disassemble output into assemble to get the same results... The following code tries to bridge that gap. It is far from feature-complete and will need much more work. But it at least works in several simple cases (see the examples at bottom), and will be extended to cover more. In any case, the "proof of the pudding" is: * write a proc, e.g. "f", in Tcl * disassemble its body * use [dis2asm] to convert the disassembly to TAL (input) language * test it by calling it (only then the assembly takes place), see if it works like before Some shortcuts to start with: interp alias {} asm {} ::tcl::unsupported::assemble interp alias {} disasm {} ::tcl::unsupported::disassemble I have extended the ''aproc'' function, which before just returned the disassembly, to accept an extra -x flag to "reassemble" the disassembly, and [eval] it as a proc, so you can test it just by calling it. The original disassembly code is also shown in a comment. ====== proc aproc {name argl body args} { proc $name $argl $body set res [disasm proc $name] if {"-x" in $args} { set res [list proc $name [add_locals $argl $res] [list asm [dis2asm $res]]] eval $res } return $res } ====== This workaround scans the disassembly for local vars that are not in the argument lists, and appends them to it as optionals, defaulting to "". Otherwise, reassembly fails with cannot use this instruction to create a variable in a non-proc context Note that the [awk] language has the same mechanism for "declaring" local variables since a long time... ====== proc add_locals {argl disasm} { foreach line [split $disasm \n] { if [regexp {slot.+scalar, "(.+)"} $line -> local] { lappend argl [list $local ""] } } return $argl } ====== ====== proc dis2asm body { set res "" set jumptargets {} foreach line [split $body \n] { if [regexp {\# pc (\d+)} $line -> pc] {lappend jumptargets $pc} } foreach line [split $body \n] { set line [string trim $line] if {$line eq ""} continue set code "" if {[regexp {\((\d+)\) (.+)} $line -> pc instr]} { if {$pc in $jumptargets} { append res "\n label L$pc;" } if {[regexp {(.+)#(.+)} $instr -> instr comment]} { set arg [lindex $comment end] if {$arg eq ""} {set arg "{}"} if [string match jump* $instr] {set arg L$arg} } else {set arg ""} set instr0 [normalize [lindex $instr 0]] if {$instr0 in {invokeStk}} {set arg [lindex $instr end]} if {$instr0 in {incrImm}} {set arg [list $arg [lindex $instr end]]} set code [format " %-24s" "$instr0 $arg"] if {$instr0 in {startCommand}} {set code ""} append res "\n $code ;# [string trim $line]" } } append res \n return $res } ====== This translates "dis" instruction names to "asm" instruction names, where different: ====== proc normalize instr { regsub {\d+$} $instr "" instr ;# strip off length indicator set instr [string map { loadScalar load done "" nop "" storeScalar store incrScalar1Imm incrImm } $instr] return $instr } ====== Now to try it out. ====== % aproc f x {expr {sqrt($x)+1}} -x proc f x {asm { push tcl::mathfunc::sqrt ;# (0) push1 0 # "tcl::mathfunc::sqrt" load x ;# (2) loadScalar1 %v0 # var "x" invokeStk 2 ;# (4) invokeStk1 2 push 1 ;# (6) push1 1 # "1" add ;# (8) add ;# (9) done}} % f 2 2.414213562373095 ====== The following test shows that while mostly we have to remove parts of the disassembly, for jump targets we need to insert '''label''' pseudo-instructions: ====== % aproc f x {if {$x==1} {set x 2};return $x} -x proc f x {asm { load x ;# (0) loadScalar1 %v0 # var "x" push 1 ;# (2) push1 0 # "1" eq ;# (4) eq jumpFalse L13 ;# (5) jumpFalse1 +8 # pc 13 push 2 ;# (7) push1 1 # "2" store x ;# (9) storeScalar1 %v0 # var "x" jump L15 ;# (11) jump1 +4 # pc 15 label L13; push {} ;# (13) push1 2 # "" label L15; pop ;# (15) pop load x ;# (16) loadScalar1 %v0 # var "x" ;# (18) done}} % f 3 3 % f 1 2 ====== ====== % aproc hypot {x y} {expr {sqrt($x**2+$y**2)}} -x proc hypot {x y} {asm { push tcl::mathfunc::sqrt ;# (0) push1 0 # "tcl::mathfunc::sqrt" load x ;# (2) loadScalar1 %v0 # var "x" push 2 ;# (4) push1 1 # "2" expon ;# (6) expon load y ;# (7) loadScalar1 %v1 # var "y" push 2 ;# (9) push1 1 # "2" expon ;# (11) expon add ;# (12) add invokeStk 2 ;# (13) invokeStk1 2 tryCvtToNumeric ;# (15) tryCvtToNumeric ;# (16) done}} % hypot 3 4 5.0 ====== ====== % aproc f x {incr x -1} -x proc f x {asm { incrImm x -1 ;# (0) incrScalar1Imm %v0 -1 # var "x" ;# (3) done}} % f 5 4 ====== Local variables seem only to be possible if they are, possibly with default values, in the argument list. See also my "add_locals" workaround above. Given that, [while] loops can already be handled: ====== % aproc f {x {i 0}} {while {$i <= $x} {puts $i; incr i}} -x proc f {x {i 0}} {asm { jump L22 ;# (0) jump1 +22 # pc 22 label L2; push puts ;# (2) push1 0 # "puts" load i ;# (4) loadScalar1 %v1 # var "i" invokeStk 2 ;# (6) invokeStk1 2 pop ;# (8) pop ;# (9) startCommand +12 1 # next cmd at pc 21 incrImm i +1 ;# (18) incrScalar1Imm %v1 +1 # var "i" pop ;# (21) pop label L22; load i ;# (22) loadScalar1 %v1 # var "i" load x ;# (24) loadScalar1 %v0 # var "x" le ;# (26) le jumpTrue L2 ;# (27) jumpTrue1 -25 # pc 2 push {} ;# (29) push1 1 # "" ;# (31) done }} ====== ... or, what also works, to push the name of a local variable and use (load|store)Stk instructions: ====== % proc f x {asm { push i; push 42; storeStk; # "i" := "42" pop; # needed - otherwise stack is unbalanced in the end push i; loadStk; dup; add }} % f x 84 ====== But plain "load" didn't work here (again, now fixed with the "add_locals" workaround above): ====== % proc f x {asm {push i;push 42;storeStk;load i; dup; add}} % f x cannot use this instruction to create a variable in a non-proc context ====== ---- '''What doesn't work yet''' The following illustrates the issue of mid-code "done" (see (9)).. we can't convert it to input TAL. Maybe a "jump" to a label at the end might help? More specifically, when a "done" is met in non-final position, it shall be converted to "jump done", and a "label done" added at the end. ====== % aproc f x {if {$x > 0} {return 1} else {return 0}} -x proc f x {asm { load x ;# (0) loadScalar1 %v0 # var "x" push 0 ;# (2) push1 0 # "0" gt ;# (4) gt jumpFalse L12 ;# (5) jumpFalse1 +7 # pc 12 push 1 ;# (7) push1 1 # "1" ;# (9) done ;# (10) nop ;# (11) nop label L12; push 0 ;# (12) push1 0 # "0" ;# (14) done ;# (15) done}} % f 1 inconsistent stack depths on two execution paths ====== The above error came from a Tclkit 8.6.1 on a Lubuntu netbook. The same example on tclsh 8.6b2 on Win XP works there (with the solution I had in mind - jump to final "done" ;^) ====== % aproc f x {if {$x > 0} {return 1} else {return 0}} -x proc f x {asm { load x ;# (0) loadScalar1 %v0 # var "x" push 0 ;# (2) push1 0 # "0" gt ;# (4) gt jumpFalse L21 ;# (5) jumpFalse1 +16 # pc 21 ;# (7) startCommand +12 1 # next cmd at pc 19 push 1 ;# (16) push1 1 # "1" ;# (18) done jump L33 ;# (19) jump1 +14 # pc 33 label L21; ;# (21) startCommand +12 1 # next cmd at pc 33 push 0 ;# (30) push1 0 # "0" ;# (32) done label L33; ;# (33) done }} 37 % f 3 1 ====== Especially the "dis" language generated for [foreach] looks so different that at the moment I'm only puzzled. The language accepted by "asm" has no instruction beginning with "foreach"...: ====== 137 % aproc f x {foreach i $x {puts $i}} ByteCode 0x01513B48, refCt 1, epoch 5, interp 0x00950B70 (epoch 5) Source "foreach i $x {puts $i}" Cmds 2, src 22, inst 29, litObjs 2, aux 1, stkDepth 2, code/src 0.00 Proc 0x01523EE8, refCt 1, args 1, compiled locals 4 slot 0, scalar, arg, "x" slot 1, scalar, temp slot 2, scalar, temp slot 3, scalar, "i" Exception ranges 1, depth 1: 0: level 0, loop, pc 17-22, continue 10, break 26 Commands 2: 1: pc 0-27, src 0-21 2: pc 17-22, src 14-20 Command 1: "foreach i $x {puts $i}" (0) loadScalar1 %v0 # var "x" (2) storeScalar1 %v1 # temp var 1 (4) pop (5) foreach_start4 0 [data=[%v1], loop=%v2 it%v1 [%v3]] (10) foreach_step4 0 [data=[%v1], loop=%v2 it%v1 [%v3]] (15) jumpFalse1 +11 # pc 26 Command 2: "puts $i" (17) push1 0 # "puts" (19) loadScalar1 %v3 # var "i" (21) invokeStk1 2 (23) pop (24) jump1 -14 # pc 10 (26) push1 1 # "" (28) done ====== <>Category Example