dis2asm learns to catch

Richard Suchenwirth 2013-12-01 - Another chapter in the dis2asm saga. The set of accepted TAL instructions includes beginCatch and endCatch, so I wanted to try that out.

% aproc f x {catch {expr {1/$x}}} -x
proc f x {asm {
   beginCatch               ;# (0) beginCatch4 0
   push 1                   ;# (5) push1 0         # "1"
   load x                   ;# (7) loadScalar1 %v0         # var "x"
   div                      ;# (9) div
   pop                      ;# (10) pop
   push 0                   ;# (11) push1 1         # "0"
   jump L16                 ;# (13) jump1 +3         # pc 16
   pushReturnCode           ;# (15) pushReturnCode
 label L16;
   endCatch                 ;# (16) endCatch
                            ;# (17) done
 label Done;
}}
% f 4
wrong # args: should be "beginCatch label"

Hmm... but which label should it be? The disassembly line (0) gives no obvious hint (0 would be its own position...). I pasted the generated asm proc into the editor and tried the first obvious candidate, the only label present: L16. But writing it after the beginCatch instruction, and retesting, brought:

 inconsistent stack depths on two execution paths

Looking closer at the TAL code, there is a jump L16 two lines above the label, followed by a pushReturnCode that is unreachable as things stand. It doesn't have a label, but we can easily assign one.

Apparently, when dis2asm encounters a beginCatch instruction, it must look ahead from the current position to find a suitable pushReturnCode. Of course catches can be nested (the depth might be indicated by the "0" argument in the disassembly), but as neither the matching pushReturnCode nor the endCatch indicates its nesting depth, we can only disallow nested catches for now and take the first pushReturnCode that comes along. Easy, and as the body of dis2asm is already longer than I like, I wrote a proc for that:

# Return the pc of the first pushReturnCode found from index $lineno on.
# Nested catches are not handled.
proc findCatchEnd {lines lineno} {
    for {set i $lineno} {$i < [llength $lines]} {incr i} {
        if {[regexp {\((\d+)\) pushReturnCode} [lindex $lines $i] -> pc]} {
            return $pc
        }
    }
    error "could not find end of catch beginning at line $lineno"
}
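
Since nested catches are simply disallowed for now, it may be worth refusing them explicitly instead of silently picking the wrong pushReturnCode. The following is only a sketch: checkNoNestedCatch is a hypothetical helper that is not part of dis2asm, it assumes lineno is the list index of the beginCatch line, and the nested disassembly below is made up for the demonstration.

```tcl
# Hypothetical helper, not part of dis2asm: complain about a nested catch.
# Assumes every disassembly line contains "(pc) instrName ...".
proc checkNoNestedCatch {lines lineno} {
    for {set i [expr {$lineno + 1}]} {$i < [llength $lines]} {incr i} {
        set line [lindex $lines $i]
        # reaching an endCatch first means the catch range was flat
        if {[regexp {\(\d+\) endCatch} $line]} {return}
        # a second beginCatch before that endCatch means nesting
        if {[regexp {\(\d+\) beginCatch} $line]} {
            error "nested catch at line $i is not supported"
        }
    }
}

set flat {
    "(0) beginCatch4 0"
    "(5) push1 0"
    "(15) pushReturnCode"
    "(16) endCatch"
}
# made-up pcs, only to exercise the nested case
set nested {
    "(0) beginCatch4 0"
    "(5) beginCatch4 1"
    "(20) pushReturnCode"
    "(21) endCatch"
    "(30) pushReturnCode"
    "(31) endCatch"
}
checkNoNestedCatch $flat 0
puts [catch {checkNoNestedCatch $nested 0} msg]
puts $msg
```

Such a check could be called just before the label lookup, so that an unsupported input fails with a clear message rather than with malformed TAL.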

The call to findCatchEnd is placed in the command-specific switch in dis2asm:

...
            switch -- $instr0 {
                beginCatch {
                    set catchend [findCatchEnd $lines $lineno]
                    lappend code L$catchend
                    lappend jumptargets $catchend
                }
                done          {
...

It retrieves the program counter of the pushReturnCode, supplies it to the beginCatch instruction as required, and also puts it on the list of jump targets, so a label pseudo-instruction is inserted there when the time comes.
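
How a jump target later turns into a label line is not shown here. The following is only a simplified sketch of the idea and not the real dis2asm body: emitWithLabels is a hypothetical proc, and its input lines are reduced to the assumed form "(pc) instruction".

```tcl
# Hypothetical stand-in for the label pass, not the actual dis2asm code.
proc emitWithLabels {lines jumptargets} {
    set out {}
    foreach line $lines {
        # each line is assumed to look like "(pc) instruction ..."
        if {[regexp {^\((\d+)\) (.*)$} $line -> pc rest]} {
            # a pc that some instruction refers to gets a label in front
            if {$pc in $jumptargets} {lappend out " label L$pc;"}
            lappend out "   $rest"
        }
    }
    return [join $out \n]
}

puts [emitWithLabels {
    "(13) jump L16"
    "(15) pushReturnCode"
    "(16) endCatch"
} {15 16}]
```

With 15 and 16 in the target list, the pushReturnCode and endCatch lines each get a label line in front of them, which is the shape the generated TAL needs.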

Testing shows that the generated TAL is now well-formed, and catch reacts as we expect:

% aproc f x {catch {expr {1/$x}}} -x
proc f x {asm {
   beginCatch L15           ;# (0) beginCatch4 0
   push 1                   ;# (5) push1 0         # "1"
   load x                   ;# (7) loadScalar1 %v0         # var "x"
   div                      ;# (9) div
   pop                      ;# (10) pop
   push 0                   ;# (11) push1 1         # "0"
   jump L16                 ;# (13) jump1 +3         # pc 16
 label L15;
   pushReturnCode           ;# (15) pushReturnCode
 label L16;
   endCatch                 ;# (16) endCatch
                            ;# (17) done
 label Done;
}}
% f 1
0
% f x
1
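
For comparison, here is the same proc in plain Tcl without asm, as a reference for the return codes the TAL version should reproduce. The f 0 call is an extra case that does not appear in the session above.

```tcl
# Reference behaviour: catch returns 0 on success and 1 on error.
proc f x {catch {expr {1/$x}}}

puts [f 1]   ;# 0: expr succeeded
puts [f x]   ;# 1: "x" is not a number, so expr raised an error
puts [f 0]   ;# 1: divide by zero
```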

Testing further, now with the optional variable to hold the result:

% aproc f x {catch {expr {1/$x}} res} -x
proc f x {asm {
   beginCatch L15           ;# (0) beginCatch4 0
   push 1                   ;# (5) push1 0         # "1"
   load x                   ;# (7) loadScalar1 %v0         # var "x"
   div                      ;# (9) div
   push 0                   ;# (10) push1 1         # "0"
   jump L16                 ;# (12) jump1 +4         # pc 16
   pushResult               ;# (14) pushResult
 label L15;
   pushReturnCode           ;# (15) pushReturnCode
 label L16;
   endCatch                 ;# (16) endCatch
   reverse 2                ;# (17) reverse 2
   store res                ;# (22) storeScalar1 %v1         # var "res"
   pop                      ;# (24) pop
                            ;# (25) done
 label Done;
}}
% f 1
inconsistent stack depths on two execution paths
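
One way to see what the assembler objects to is to count stack depths by hand along both paths into label L16. The sketch below does that with a small table of net stack effects. The table and the proc depthAtL16 are illustrative assumptions, as is the idea that the handler starts at the depth recorded by beginCatch (taken as 0 here).

```tcl
# Net stack effect per instruction; only the ones used in the listing.
set effect {push 1 load 1 div -1 pushResult 1 pushReturnCode 1}

# Depth reached at label L16 after running the given instructions,
# starting from the depth at beginCatch (assumed to be 0).
proc depthAtL16 {effect instrs} {
    set depth 0
    foreach instr $instrs {incr depth [dict get $effect $instr]}
    return $depth
}

set normal {push load div push}          ;# pc 5..10, then jump L16
set viaL15 {pushReturnCode}              ;# handler entered at pc 15
set viaL14 {pushResult pushReturnCode}   ;# handler entered at pc 14

puts "normal path: [depthAtL16 $effect $normal]"
puts "via L15:     [depthAtL16 $effect $viaL15]"
puts "via L14:     [depthAtL16 $effect $viaL14]"
```

The normal path arrives with two values on the stack, a handler entered at pc 15 with only one, and a handler entered one instruction earlier, at the pushResult, with two again.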

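So the solution from above was not bullet-proof enough: with a result variable, a pushResult (14) sits in front of the pushReturnCode, and a handler entered at L15 skips it, so the two paths reach L16 with different stack depths. Next approach: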

  • the beginCatch is matched by the endCatch downstream. Note its program counter (16 in the example)
  • from the beginCatch position, search for a line that jumps there (12 in the example)
  • advance one line from there (14) - return that program counter as result

This implementation passes both test cases:

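# Return the pc where the catch handler starts: the instruction right after
# the jump that skips over the handler to the matching endCatch.
proc findCatchEnd {lines lineno} {
    set pc ""
    # 1. find the next endCatch and note its pc
    for {set i $lineno} {$i < [llength $lines]} {incr i} {
        if {[regexp {\((\d+)\) endCatch} [lindex $lines $i] -> pc]} break
    }
    if {$pc eq ""} {error "could not find end of catch for line $lineno"}
    # 2. find the jump to that pc; \M keeps "pc 16" from matching "pc 160"
    for {set i $lineno} {$i < [llength $lines]} {incr i} {
        if {[regexp "jump.*pc $pc\\M" [lindex $lines $i]]} {
            # 3. the handler starts on the following line
            if {[regexp {\((\d+)\)} [lindex $lines $i+1] -> pc2]} {
                return $pc2
            }
        }
    }
    error "could not find jump source for $pc"
}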

% aproc f x {catch {expr {1/$x}} res} -x
proc f x {asm {
   beginCatch L14           ;# (0) beginCatch4 0
   push 1                   ;# (5) push1 0         # "1"
   load x                   ;# (7) loadScalar1 %v0         # var "x"
   div                      ;# (9) div
   push 0                   ;# (10) push1 1         # "0"
   jump L16                 ;# (12) jump1 +4         # pc 16
 label L14;
   pushResult               ;# (14) pushResult
   pushReturnCode           ;# (15) pushReturnCode
 label L16;
   endCatch                 ;# (16) endCatch
   reverse 2                ;# (17) reverse 2
   store res                ;# (22) storeScalar1 %v1         # var "res"
   pop                      ;# (24) pop
                            ;# (25) done
 label Done;
}}
% f 1
0
% f x
1

A peephole optimizer, as in dis2asm gets better, might notice here that the local "res" variable is never read before the proc returns, and cancel out lines 14, 17, 22, 24 (putting back the pop after div), for a net saving of 8 bytes, basically reverting to the variable-less catch implementation:

% aproc f x {catch {expr {1/$x}}} -x
proc f x {asm {
   beginCatch L15           ;# (0) beginCatch4 0
   push 1                   ;# (5) push1 0         # "1"
   load x                   ;# (7) loadScalar1 %v0         # var "x"
   div                      ;# (9) div
   pop                      ;# (10) pop
   push 0                   ;# (11) push1 1         # "0"
   jump L16                 ;# (13) jump1 +3         # pc 16
 label L15;
   pushReturnCode           ;# (15) pushReturnCode
 label L16;
   endCatch                 ;# (16) endCatch
                            ;# (17) done
 label Done;
}}
% f 1
0
% f x
1
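
The 8-byte figure can be checked from the program counters in the two listings: the size of an instruction is the distance to the next pc. This is only a back-of-the-envelope sketch, and the variable names are mine.

```tcl
# pc and following pc of the four instructions that would be dropped:
# pushResult (14), reverse 2 (17), store res (22), pop (24)
set dropped {14 15  17 22  22 24  24 25}

set removed 0
foreach {pc nextpc} $dropped {incr removed [expr {$nextpc - $pc}]}

# the variable-less form has one extra pop (pc 10, one byte) after div
set added 1

puts "removed $removed, added $added, net [expr {$removed - $added}]"
# cross-check against the pc of "done" in both listings
puts "done at 25 vs 17: [expr {25 - 17}]"
```

Nine bytes go away and one comes back, which agrees with the difference between the two positions of done.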