Version 13 of returneval

Updated 2011-07-13 03:39:01 by RLE

A problem in Tcl (which is addressed, in various ways, by TIP #90 [L1 ]) is that the -code option to return has certain limitations; in particular it hides the final return code specified by this option in a place where Tcl scripts cannot access it, and so if a [return -code] is caught before it has caused a procedure body to return then some information will be lost. One consequence of this is that the two procedures defined by

 proc a {} {while 1 {return -code error}}
 package require control
 proc b {} {control::do {return -code error} while 1}

display different behaviour. a returns with an error on the first iteration. b returns without an error on the first iteration. In some sense this difference is a deficiency in control::do, but there is in fact no way to implement it so that it handles this. The language does not support it!

Below are defined some commands that provide a workaround for this. The idea is to have a command [returneval] which not only causes the calling procedure to return but also evaluates a user-supplied command "in place of" the procedure that returned. This lets one define a procedure c through

 eproc c {} {control::do {returneval {error ""}} while 1}

that behaves just like a. In particular,

 catch {a}
 catch {c}

both return 1 whereas

 catch {b}

returns 0.

What one must keep in mind when considering these matters is how Tcl command return codes work in general, and the manner in which the return return code (2) works in particular. It is really quite simple, but as it is also easily mistaken for being magical, many people are confused by it. Most Tcl commands make a distinction only between two classes of return codes: the ok return code (0), and the other return codes. If they get an ok return code from some recursive invocation of the Tcl interpreter then they do what they usually do. If they get any other return code then they return themselves, passing the nonzero return code (and the corresponding return value / error message) back down to whatever called the interpreter that time. The main exception is of course the catch command, since it just returns the return code of the script it recursively invoked as its return value, whereas its return code is the normal 0.

This quick propagation of a nonzero return code is how errors are transferred back to the outermost caller. This is also how return works: the return code of this command is nonzero (2, to be precise) and the commands surrounding a return will therefore quickly pass this along. If it reaches a catch then it will be caught (and the return value of that catch will be 2), but it usually doesn't. What makes the return command different from the error command in normal usage is however that procedure bodies treat the corresponding return codes differently. An error return code (1) is passed along, but a return return code is intercepted and it will usually change to something different. A simple [return] or [return -code ok] will get the return code ok (0) when they meet a procedure body. A [return -code error] will get the return code error (1) when it meets a procedure body. A [return -code break] will get the return code break (3) when it meets a procedure body. And so on.

Something similar happens with the break and continue commands, although the return codes produced by these are intercepted not only by catch and procedure bodies, but also by the loop commands (for, foreach, while, ...). On the other hand, the uplevel command does not intercept any return codes. A dark horse in these contexts is subst, which interprets everything but 1 (error) as 0 (ok), including e.g. 2 (return) and 3 (break), when it does command substitution.

The returneval command makes use of the nonstandard return code -1. The reason for choosing this can be found on the uplevel page.

 proc returneval {script} {return -code -1 $script}

However, this will just behave as an error if it isn't intercepted at some point. Therefore any procedure which you might want to returneval from must be defined using eproc rather than proc.

 proc eproc {name arglist body} {
     uplevel 1 [list proc "$name " $arglist $body]
     set full [uplevel 1 [list namespace which -command "$name "]]
     interp alias {} [string range $full 0 end-1] {} [namespace which -command eproc_call] $full
 }
 proc eproc_call {args} {
     set code [catch [list uplevel 1 $args] res]
     if {$code == -1} then {
         set code [catch [list uplevel 1 $res] res]
         return -code $code $res
     } elseif {$code == 1} then {
         return -code error -errorinfo $::errorInfo -errorcode $::errorCode $res
     } else {
         return -code $code $res
     }
 }

[Also explain why the above works]


About the original definition

 proc eproc {name arglist body} {
    interp alias {} $name {} eproc_call "$name "
    proc "$name " $arglist $body
 }

of eproc, DGP wrote: Very nice. This could possibly be a way to work around the limitations of the control package commands until a TIP 90 solution is in place.

Some nitpicking: it looks like [eproc] assumes it is called from the :: namespace. To make this robust, there should be an [uplevel 1 ::namespace current] to discover the namespace context of the caller, then be careful to create both the alias and the proc in that namespace.

Lars H: I suspect you're right about that. Would it work to simply uplevel the commands in eproc, i.e.,

 proc eproc {name arglist body} {
     uplevel 1 [list interp alias {} $name {} ::eproc_call "$name "]
     uplevel 1 [list proc "$name " $arglist $body]
 }

?

DGP For the [proc], yes. For the [interp alias] no. For no apparently good reason, [proc]s get defined in the current namespace while aliases get defined in the namespace :: of the target interp.

Lars H: (2002-12-02) OK, now I have fixed that namespace issue. As a side-effect, eproc now returns the full name of the alias it created.

(2002-12-12) Another namespace fix, so that eproc and eproc_call doesn't have to be defined in the :: namespace. Another note: The handling of negative return codes is broken in some Tcl 8.4 versions, but that has been fixed in CVS when I write this. If you want to use the above commands, but you have a buggy version, then substitute some positive integer > 4 for the -1 return code in returneval and eproc_call.


Lars H, 2005-04-11: Today I had a "brilliant" idea on how to make returneval safe for tail call optimization, and put the following here on the wiki:

 proc eproc_call {args} {
     if {[info level]>1 &&\
         [lindex [info level 0] 0] eq [lindex [info level 1] 0]}\
     then {return -code -1 $args}
     set code -1
     while {$code == -1} {
         set code [catch [list uplevel 1 $args] args]
     }
     if {$code == 1} then {
         return -code error -errorinfo $::errorInfo -errorcode $::errorCode $args
     } else {
         return -code $code $args
     }
 }

I think the idea is sound: When eproc_call notices it has been called by itself, it can safely pass the call back to the calling eproc_call and let everything be handled there -- thus there will not be a stack buildup, which is what prevented tail calls in the first place. The problem is however that in practice uplevel will hide the outer eproc_call from the inner, so the then branch of the first if above is never taken, and the stack starts piling higher (but out of sight from info level, which fooled me at first). Close, but no cigar.

Unless I get it sorted out somehow, the following should be ignored: Another modification for this is to allow several arguments of returneval:

  proc returneval {args} {
      if {[llength $args]==1} then {set args [lindex $args 0]}
      return -code -1 $args
  }

With these, a tail-recursive factorial fac' can be coded as follows

  proc fac {n {prod 1}} {
      if {$n<=1} then {
          return $prod
      } else {
          returneval fac [expr {$n-1}] [expr {$n*$prod}]
      }
  }