Updated for revision 5
George Peter Staplin: I read the page Scripted Code Generation and noticed that there wasn't any machine code generator/assembler, so I came up with some code over the period of two days.
First I needed a way to allocate executable memory, so I borrowed code from my Perpheon project for alloc_exec_mem() and free_exec_mem() which uses mmap() where appropriate. My next step was to design an API for appending instructions to the mmap'd memory.
Here is the API provided for interfacing with the mmap'd memory:
#define cmd(func,name) \ Tcl_CreateObjCommand (interp, name, func, (ClientData) NULL, (Tcl_CmdDeleteProc *) NULL) cmd (asm_align, "asm.align"); cmd (asm_alloc, "asm.alloc"); cmd (asm_append_byte, "asm.append.byte"); cmd (asm_append_int, "asm.append.int"); cmd (asm_execute, "asm.execute"); cmd (asm_fetch_offset, "asm.fetch.offset"); cmd (asm_free, "asm.free"); cmd (asm_subtract_offsets, "asm.-.offsets"); cmd (asm_store, "asm.store"); #undef cmd
As you can see, it's quite minimal. I then began the work of reading the Intel manual for finding the codes needed. I used the Intel manual to analyze code like:
s: asm ("movl $-1,(%esp)"); e: dump_bytes (stderr, &&s, (&&e - &&s));
I wanted my assembler to be more structured than a bunch of gotos, so I borrowed from Forth the constructs IF/THEN. I however prefer the terms THEN/OTHERWISE because they make more sense (to me).
Consider code like:
1 2 < then 5 6 7 otherwise
The 1 and 2 get pushed. < pops 2 and compares and pushes a result of the comparison. "then" pops and compares to 0 and if the comparison is true it jumps to the position of otherwise. During the compilation of then/otherwise, "then" appends some code for the pop and 0 comparison, and then two values are pushed onto a compile-time stack so that we can fill in the relative amount to jump for reaching the position of "otherwise". As "otherwise" is run it pops two addresses off the stack and stores the difference into the memory, so that we know where to jump. The end result is that THEN/OTHERWISE constructs may be nested.
You can download a tarball here: http://www.xmission.com/~georgeps/implementation/software/asm/ (latest is release 5. It works in NetBSD 2.0 x86 and should work with other systems (with minimal makefile changes).)
I wrote the sources using Fed Builder; which greatly reduced the amount of time needed. There is only one file to compile, because the others are #include'd.
REVISION HISTORY
#5 fixes an alignment error in the old asm.assemble. I had forgotten to align prior to appending an integer in asm.assemble. I added more overflow checks (silly checks that exit). asm.align now takes a NOP char/byte (in preparation for a port to the Sparc).
Here is the core of the assembler:
#Copyright 2004 George Peter Staplin #Revision 5 load ./asm.so ;#portable C code for managing the bytes, alignment, etc. set ::ops(+) add set ::ops(-) sub set ::ops(<) less_than set ::ops(>) greater_than set ::ops(drop) drop set ::ops(dup) dup set ::ops(otherwise) otherwise set ::ops(then) then #R1 is %eax #R2 is %ecx #These opcodes are from dumps and section 3.3 of the processor manual. set ::ADD_R1_STACK "\x01\x04\x24" set ::COMPARE_R1_TO_R2 "\x39\xc1" set ::DROP "\x83\xc4\x04" set ::DUP "\x8b\x04\x24\x50" set ::JUMP_EQUAL_RELATIVE_INT "\x0f\x84" set ::JUMP_GREATER_THAN_RELATIVE_INT "\x0f\x8f" set ::JUMP_LESS_THAN_RELATIVE_INT "\x0f\x8c" set ::JUMP_RELATIVE_INT "\x0f\x84" set ::MOVE_0_INTO_TOP "\xc7\x04\x24\x00\x00\x00\x00" set ::MOVE_0_INTO_R2 "\xb9\x00\x00\x00\x00" set ::NOP "\x90" set ::POP_INTO_R1 "\x58" set ::POP_INTO_R2 "\x59" set ::PUSH_WORD "\x68" set ::SUB_R1_STACK "\x29\x04\x24" set ::assemble_stack [list] proc add exe { asm.append.byte $exe $::POP_INTO_R1 asm.append.byte $exe $::ADD_R1_STACK } proc drop exe { asm.append.byte $exe $::DROP } proc dup exe { asm.append.byte $exe $::DUP } proc greater_than exe { asm.append.byte $exe $::POP_INTO_R1 asm.append.byte $exe $::POP_INTO_R2 asm.align $exe $::NOP 3 asm.append.byte $exe $::PUSH_WORD asm.append.int $exe -1 asm.append.byte $exe $::COMPARE_R1_TO_R2 asm.append.byte $exe $::JUMP_GREATER_THAN_RELATIVE_INT asm.append.int $exe 7 asm.append.byte $exe $::MOVE_0_INTO_TOP } proc less_than exe { asm.append.byte $exe $::POP_INTO_R1 asm.append.byte $exe $::POP_INTO_R2 asm.align $exe $::NOP 3 asm.append.byte $exe $::PUSH_WORD asm.append.int $exe -1 asm.append.byte $exe $::COMPARE_R1_TO_R2 asm.align $exe $::NOP 2 asm.append.byte $exe $::JUMP_LESS_THAN_RELATIVE_INT asm.append.int $exe 7 ;#jump 7 bytes asm.append.byte $exe $::MOVE_0_INTO_TOP } proc otherwise exe { set s [pop] set e [asm.fetch.offset $exe] set jmp_addr [pop] asm.store $jmp_addr [asm.-.offsets $e $s] } proc pop {} { set r [lindex $::compile_stack end] set ::compile_stack [lrange $::compile_stack 0 end-1] return $r } proc push val { lappend ::compile_stack $val } proc sub exe { asm.append.byte $exe $::POP_INTO_R1 asm.append.byte $exe $::SUB_R1_STACK } proc then exe { asm.append.byte $exe $::POP_INTO_R1 asm.append.byte $exe $::MOVE_0_INTO_R2 asm.append.byte $exe $::COMPARE_R1_TO_R2 asm.append.byte $exe $::JUMP_EQUAL_RELATIVE_INT push [asm.fetch.offset $exe] asm.append.int $exe 0 push [asm.fetch.offset $exe] } proc asm.assemble s { set exe [asm.alloc] foreach i [split $s] { set i [string trim $i] if {"" eq $i} continue if {[string is integer $i]} { asm.align $exe $::NOP 3 asm.append.byte $exe $::PUSH_WORD asm.append.int $exe $i } else { $::ops($i) $exe } } return $exe } proc main {} { set exe [asm.assemble { 5 10 < then 4 5 + 8 > then 1234 otherwise otherwise }] puts [asm.execute $exe] asm.free $exe } main
Lars H: Looks nice (although it was ages since I did any Intel assembler). One thought, though: Would it make sense to make the higher assembler constructs more similar to Tcl? Then instead of
4 5 + 8 > then 1234 otherwise
one would (perhaps) write something like
if { 4 5 + 8 > } then { 1234 }
(with optional else/elseif clauses and all). I suppose it could make it more complicated to parsing the assembler.
George Peter Staplin: It would make it more complex to parse. Feel free to implement and share if you want. After looking at the code today it occured to me that an OOP approach to the structure of the assembler would work well (and get rid of so many globals), and it might allow for the type of code structuring you mention. :)