''[Gerald Lester] - Actually this is a simulator, an [Assembler] would take the code and spit out a binary object file to run on the hardware. Of course a simulator is *much* harder to write than an Assembler, which makes this all the more impressive!'' [Richard Suchenwirth] - In the [Tcl and other languages] series, one of the oldest was still missing: Assembler (sort of a symbolic wrapper around pure machine code). Now here is the beginning for that... As this is a weekend fun project, I picked those parts of Intel 8080A/8085 Assembler (because I had a detailed reference handy) that are easily implemented and still somehow educational (or nostalgic ;-). Of course this is no real assembler. The memory model is constant-size instructions (strings in array elements), which are implemented as Tcl procs. So an "assembler" program in this plaything will run even slower than in pure Tcl, and consume more memory - while normally you associate speed and conciseness with "real" assembler code. But it looks halfway like the real thing: you get sort of an assembly listing with symbol table, and can ''run'' it - I'd hardly start writing an assembler in C, but in Tcl it's fun for a sunny Sunday afternoon... ====== namespace eval asm { proc asm body { variable mem catch {unset mem} ;# good for repeated sourcing foreach line [split $body \n] { foreach i {label op args} {set $i ""} regexp {([^;]*);} $line -> line ;# strip off comments regexp {^ *(([A-Z0-9]+):)? *([A-Z]*) +(.*)} [string toupper $line]\ -> - label op args puts label=$label,op=$op,args=$args if {$label!=""} {set sym($label) $PC} if {$op==""} continue if {$op=="DB"} {set mem($PC) [convertHex $args]; incr PC; continue} if {$op=="EQU"} {set sym($label) [convertHex $args]; continue} if {$op=="ORG"} {set PC [convertHex $args]; continue} regsub -all ", *" $args " " args ;# normalize commas set mem($PC) "$op $args" incr PC } substituteSymbols sym dump sym } proc convertHex s { if [regexp {^([0-9A-F]+)H$} [string trim $s] -> s] {set s [expr 0x$s]} set s } proc substituteSymbols {_sym} { variable mem upvar $_sym sym foreach i [array names mem] { set tmp [lindex $mem($i) 0] foreach j [lrange $mem($i) 1 end] { if {[array names sym $j]==$j} {set j $sym($j)} lappend tmp $j } set mem($i) $tmp } } proc dump {_sym} { variable mem upvar $_sym sym foreach i [lsort -integer [array names mem]] { puts [format "%04d %s" $i $mem($i)] } foreach i [lsort [array names sym]] { puts [format "%-10s: %04x" $i $sym($i)] } } proc run {{pc 255}} { variable mem foreach i {A B C D E Z} {set ::$i 0} while {$pc>=0} { incr pc #puts "$mem($pc)\tA:$::A B:$::B C:$::C D:$::D E:$::E Z:$::Z" eval $mem($pc) } } #----------------- "machine opcodes" implemented as procs proc ADD {reg reg2} {set ::Z [incr ::$reg [set ::$reg2]]} proc ADI {reg value} {set ::Z [incr ::$reg $value]} proc CALL {name} {[string tolower $name] $::A} proc DCR {reg} {set ::Z [incr ::$reg -1]} proc INR {reg} {set ::Z [incr ::$reg]} proc JMP where {uplevel 1 set pc [expr $where-1]} proc JNZ where {if $::Z {uplevel 1 JMP $where}} proc JZ where {if !$::Z {uplevel 1 JMP $where}} proc MOV {reg adr} {variable mem; set ::$reg $mem($adr)} proc MVI {reg value} {set ::$reg $value} } #-- Now testing: asm::asm { org 100 ; the canonical start address in CP/M jmp START ; idiomatic: get over the initial variable(s) DONE: equ 0 ; warm start in CP/M ;-) MAX: equ 5 INCR: db 2 ; a variable (though we won't vary it) ;; here we go... START: mvi c,MAX ; set count limit mvi a,0 ; initial value mov b,INCR LOOP: call puts ; for now, fall back to Tcl for I/O inr a add a,b ; just to make adding 1 more complicated dcr c ; counting down.. jnz LOOP ; jump on non-zero to LOOP jmp DONE ; end of program end } ====== The ''mov b,INCR'' part is an oversimplification. For a real 8080, one would have to say LXI H,INCR ; load double registers H+L with the address INCR MOV B,M ; load byte to register B from the address pointed to in HL Since the pseudo-register M can also be used for writing back, it cannot be implemented by simply copying the value. Rather, one could use read and write traces on variable M, causing it to load from, or store to, mem($HL). Maybe another weekend... ---- [Keith Vetter] - How about a MIX simulator so I can get all of [Knuth]'s algorithms working? ---- ''[escargo] 26 Mar 2004'' - I have sometimes wondered if there should be an assembly language level interface to the Tcl byte code. * [AK] See [TAL] "Tcl Assembly Language". ---- ''[JM] 11 Nov 2012'' - In order to actually running this example, I had to change 2 things: * I added the line: incr pc -1 to the top of the "run" proc, so you can start the execution like this: asm::run 100 instead of: asm::run 99 when the starting address is precisely 100d,and... * I added the line: set s [string trim $s] to the top of the "convertHex" proc, so the ORG's args are properly processed when such arg is decimal not hex. After running the example, the output is: label=,op=,args= label=,op=ORG,args=100 label=,op=JMP,args=START label=DONE,op=EQU,args=0 label=MAX,op=EQU,args=5 label=INCR,op=DB,args=2 label=,op=,args= label=START,op=MVI,args=C,MAX label=,op=MVI,args=A,0 label=,op=MOV,args=B,INCR label=LOOP,op=CALL,args=PUTS label=,op=INR,args=A label=,op=ADD,args=A,B label=,op=DCR,args=C label=,op=JNZ,args=LOOP label=,op=JMP,args=DONE label=,op=,args=END label=,op=,args= 0100 JMP 102 0101 2 0102 MVI C 5 0103 MVI A 0 0104 MOV B 101 0105 CALL PUTS 0106 INR A 0107 ADD A B 0108 DCR C 0109 JNZ 105 0110 JMP 0 DONE : 0000 INCR : 0065 LOOP : 0069 MAX : 0005 START : 0066 (code) 1 % asm::run 100 0 3 6 9 12 (code) 2 % See Also: [Playing PIC Simulation] <> Language | Tcl and other languages | Arts and crafts of Tcl-Tk programming