if 0 {[Richard Suchenwirth] 2004-05-19 - From Tcl 8.0, bytecode compilation has remarkably increased execution speed - code (e.g. in [proc] or [while] bodies) is no longer interpreted word by word, character by character, but rather converted into a compact sequence of bytes once, and this byte sequence is executed by a "virtual machine", Tcl_ExecuteByteCode(), or TEBC for short, as often as needed. The concept isn't new. BASICs in the 1980s had that, as well as the "p-code" into which UCSD Pascal was converted. But it certainly has advantages over naive interpretation of strings as commands. I have to admit that Tcl's real bytecodes are still sort of a mystery to me. I looked at the C code that implements and executes it, but I never managed to really work on it (in regular Tcl builds, the bytecode debugging isn't very exposed; and the code set happens to change between minor releases). But one can always play...:) This page contains a tiny implementation of my vague ideas about a bytecode engine, much weaker than the real thing, but possibly educational. Compilation of bytecode (here restricted to the 94 printable ASCII characters) is done manually into a compact reverse-Polish language (see [RPN again]), in which no spaces are needed to separate "words", as they're all one byte long. This allows for shorter (and more mysterious-looking) code... The bytecode engine ''ebc'' (execute byte code) is stack based. Instructions get their arguments from the stack, and push their result to the stack. Unrelated with this stack, I simulate a "local stack frame" by assigning the arguments to ''ebc'' to variables $1..$9 (see [half-lambda]). Constants are limited to 0..9 in this toy; the (n)@ opcode retrieves the (n)-th argument. So Tcl's lambda {1 2} {expr ($1+$2)/3} comes out in this bytecode language as 1@2@+3/ You can also assign to such local variables, with the = opcode of course. As in TEBC, [expr] operators (plus some functions) are available as first-class "commands"; I added some other list commands as needed for the examples at end. Finally, the stack is returned. One could define commands in this way: interp alias {} hypot {} ebc 1@d*2@d*+s ;# sqrt($1*$1 + $2*$2) but of course this toy runs slower than Tcl's real bytecodes.- Like in Tcl, nesting {} braces can be used to group without substitution. This way, you can push and [eval] longer strings, e.g. {puts hello}e Conditional branching is done with the (i)f opcode: c{t}{e}f which, according to the condition c on top of stack, executes either the t(hen) or the e(lse) bytecode embeddedly. What's missing is looping constructs like [for], [while], [foreach]. For this, the simple [foreach] iteration isn't sufficient - one would need a "program counter" that points into the bytecode, and can move forward or back as required, as e.g. in [Playing Assembler]. } proc ebc {code args} { foreach {1 2 3 4 5 6 7 8 9} $args break set S {} ;# clear stack set keep 0; set buffer "" foreach byte [split $code ""] { #puts [list $code | $byte | $S | $keep | $buffer] if $keep { if {$byte eq "\}"} { if ![incr keep -1] {push S $buffer; set buffer ""} } if $keep {append buffer $byte} if {$byte eq "\{"} {incr keep} continue } switch -- $byte { 0 - 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 {push S $byte} @ {push S [set [pop S]] ;# retrieve numbered argument} + - "-" - * - / - % - > - < { swap S; push S [expr [pop S] $byte [pop S]]} . {swap S; push S [pop S][pop S] ;# string concatenation} = {set [pop S] [pop S] ;# assign to a numbered variable} , {puts -nonewline "[pop S] "} _ {} d {dup S} e {push S [eval [pop S]]} f {set else [pop S];set then [pop S] if [pop S] {push S [ebc $then]} {push S [ebc $else]}} I {swap S; push S [lindex [pop S] [pop S]]} i {push S [expr int([pop S])]} L {push S [llength [pop S]]} p {pop S} r {push S [expr rand()]} S {push S [lsort -real [pop S]]} s {push S [expr sqrt([pop S])]} x {ebc [pop S]} \{ {incr keep; set buffer "" ;# start embedded command} [ {push S \uffff ;# push magic marker} ] { # Pop to magic marker and make list set idx [expr [llength $S]-1] for {} {$idx>=0} {incr idx -1} { if {[lindex $S $idx] eq "\uffff"} { set list [lrange $S [expr $idx+1] end] set S [lrange $S 0 [expr $idx-1]] push S $list break } } } default {error "unknown bytecode $byte"} } } set S } #-- Stack routines: interp alias {} push {} lappend proc pop Sname { upvar 1 $Sname S if ![llength $S] {error "stack underflow"} K [lindex $S end] [set S [lrange $S 0 end-1]] } proc K {a b} {set a} proc dup Sname {upvar 1 $Sname S; push S [lindex $S end]} proc swap Sname {upvar 1 $Sname S; push S [pop S] [pop S]} #-- Testing, and usage examples: proc ? {body expected} { catch {uplevel 1 $body} res if {$res ne $expected} {puts "$body->$res, expected $expected"} } ? {ebc 123++} 6 ;# simple starter ? {ebc +} "stack underflow" ;# deliberate error ? {ebc 1@d* 5} 25 ;# square ? {ebc 1@d*2@d*+s 3 4} 5.0 ;# hypot() ? {ebc 1@dLr*iI {red green blue}} ? ;# random choice of a list element ? {ebc 1@2@+3=3@ 5 7} 12 ;# variable assignment & retrieval interp alias {} sgn {} ebc 1@0>1@0<- ;# rmax's sign function ? {sgn -42} -1 ? {sgn 0} 0 ? {sgn 42} 1 ? {ebc 1@S0I {1.2 1.1 2.3}} 1.1 ;# minimum of a list of numbers ? {ebc 1@S{e}I {1.2 1.1 2.3 0.4}} 2.3 ;# maximum of a list of numbers ? {ebc "12{puts hello}ep+"} 3 ;# embedded code to eval ? {ebc "12>{{yes}}{{no}}f"} no ? {ebc "12<{{yes}}{{no}}f"} yes if 0 { The test suite should only complain at a random color (I can't predict that ;-), and show "hello" from the [puts] test. [DKF]: Added concatenation and [postscript]-like list-building operators. Have fun. :) ---- [Arts and crafts of Tcl-Tk programming] }