Version 22 of TIP proposal for Try-Catch Exception Handling

Updated 2008-09-23 10:27:28 by lars_h

TIP proposal: Try/Catch Exception Handling

Draft proposal by Twylite 2008-09-12.

Abstract

This TIP proposes the addition of new core commands to improve the exception handling mechanism. It supersedes TIP #89 by providing support for the error options dict introduced in Tcl 8.5 by TIP #90.

Rationale

See TIP #89 for general rationale for enhancing exception handling.

The [try...catch] syntax presented here is not intended to replace [catch], but to simplify the expression of existing exception/error handling techniques, leading to greater code clarity and less error-prone workarounds for [finally] blocks. There is no deficiency in the functionality of Tcl's exception handling mechanisms - what is lacking is a more readable syntax and a standard for behavior across packages for the common case of catching a subset of errors that are thrown from within a particular block of code.

In Tcl 8.4 exceptions could be caught using [catch], and exception information was available via the [catch] return value and resultvar. If the return value was TCL_ERROR (1) then the globals ::errorCode and ::errorInfo would be set according to the exception raised.

TIP #89 was written to work with this model, such that a catch handler (in a try...catch) would be able to capture the resultvar, errorCode and errorInfo.

Tcl 8.5 implements TIP #90 which extends [catch] to allow an additional dict of options (error information) to be captured. These options supersede the ::errorInfo and ::errorCode globals.

It is logical to extend/correct the syntax of TIP #89 to support the options dict in preference to the older mechanism for capturing exception information.

Benefits of adding this functionality to the core:

  • Bring to Tcl a construct commonly understood and widely used in other languages.
  • A standard for identifying categories/classes of errors, which will improve interoperability between packages.
  • A byte-coded implementation would be significantly faster than the Tcl implementation that is presented.

Example of use

Simple example of try/catch/finally logic in Tcl using currently available syntax:

  proc read_hex_file {fname} {
    set f [open $fname "r"]
    set data {}
    set code [catch {
      while { [gets $f line] >= 0 } {
        append data [binary format H* $line] 
      }
    } em opts]
    if { $code != 0 } {
      dict set opts -code 1
      set em "Could not process file '$fname': $em"
    }
    close $f
    if { $code != 0 } {
      return -options $opts $em
    }
    return $data
  }

And the same example rewritten to use [try...catch...finally]:

  proc read_hex_file {fname} {
    set f [open $fname "r"]
    set data {}
    try  {
      while { [gets $f line] >= 0 } {
        append data [binary format H* $line] 
      }
    } catch {* em} {
      error "Could not process file '$fname': $em"
    } finally {
      close $f
    }
    return $data
  }

This illustrates how the intent of the code is more clearly expressed by [try...catch], but does not demonstrate the use of multiple catch blocks.

Specification

  • throw type message

Since the catch handlers in the try...catch control structure will filter based on the exception's errorcode, it makes sense to have a command that will encourage the use of error codes when throwing an exception. [throw] is merely a reordering of the arguments of the [error] command.

type should be constructed as a list to maintain compatibility with ::errorCode, but it is treated as a string by [try...catch].
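As a minimal sketch of the idea (not the proposed implementation), the following shows an [error]-style command with the code argument first, and how the list-valued code then reaches a plain Tcl 8.5 [catch] through the options dict. The names demo_throw and parse_digit and the code {MYAPP PARSE ...} are hypothetical, chosen only for illustration.

```tcl
# Hypothetical stand-in for the proposed [throw]: [error] with the
# arguments reordered so that the error code comes first.
proc demo_throw {type message} {
    return -code error -errorcode $type $message
}

# A procedure that fails with a structured (list-valued) error code.
proc parse_digit {s} {
    if {![string is digit -strict $s]} {
        demo_throw [list MYAPP PARSE $s] "not a digit: '$s'"
    }
    return $s
}

# The code is recovered from the -errorcode key of the options dict.
set code [catch {parse_digit x} em opts]
set errorcode [dict get $opts -errorcode]
puts "$code | $em | $errorcode"
```

Building the code with [list] keeps it a well-formed list even when an element contains spaces, while a handler remains free to compare it as a string.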

  • try body ?catch {type ?emvar? ?optvar?} body? ?...? ?finally body?

The try body is evaluated in the caller's scope. If the result is TCL_ERROR then each catch handler is considered in order until one is found with a type that matches the exception's errorcode, then the body of that handler is executed.

Returns the result of the last executed body (but not the finally body). If the try body completes with TCL_OK then [try] returns the result of the try body; if it raises an error that a catch handler matches, [try] returns the result of that catch body.

Rules:

  • The type is a glob that is used to match against the exception's errorcode (-errorcode in the options dict, treated as a string).
  • Only one catch handler will be executed. If the type matches for more than one handler then only the first handler (reading left-to-right in the command) will be executed.
  • If no matching handler is found then the exception will propagate up the call stack. All return codes other than TCL_ERROR automatically propagate up the call stack.
  • If the catch body is a "-" then the body of the following catch block will be executed instead.
  • When the handler body is executed the error message will be stored in the emvar (if specified) and the return options dict in the optvar (if specified).
  • If an exception (in fact any return code other than TCL_OK) occurs in a catch block then the new exception takes precedence and will propagate up the stack. The original error stack will be appended to the new errorInfo in order to maintain context.
  • No support is provided for catching return codes other than TCL_ERROR. Support may be added in future via an alternative keyword to catch (say catchcode).

Irrespective of the outcome of the try or catch bodies that are executed, the finally body (if present) will be executed as the last step before the result is propagated up the call stack. If the result of the finally body is anything other than TCL_OK, that result will take precedence.
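The handler-selection rules above (glob match against the errorcode taken as a string, first match wins) can be sketched in isolation as follows. The helper name first_matching_handler and the sample patterns are hypothetical; this illustrates only the matching rule and is not part of the proposed command.

```tcl
# Hypothetical helper: return the index of the first pattern that
# glob-matches the error code (compared as a string), or -1 if none match.
proc first_matching_handler {patterns errorcode} {
    set i 0
    foreach pattern $patterns {
        if {[string match $pattern $errorcode]} {
            return $i
        }
        incr i
    }
    return -1
}

# Handlers are listed from most specific to most general.
set patterns [list "POSIX ENOENT *" "POSIX *" "*"]
puts [first_matching_handler $patterns "POSIX ENOENT {no such file or directory}"]
puts [first_matching_handler $patterns "POSIX EACCES {permission denied}"]
puts [first_matching_handler $patterns "NONE"]
puts [first_matching_handler [list "POSIX *"] "NONE"]
```

Because the comparison is on the string form, a pattern such as "POSIX *" requires a space after the first element and would not match an errorcode that is exactly "POSIX".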

Changes

  • Twylite 2008-09-22
  • Removed rethrow pending clarification on the use case. A [throw] inside a catch block will preserve the original error stack at the end of the new errorInfo.
  • [try...catch] will return the result of the last executed body block.
  • Clarified issues previously noted for consideration and incorporated feedback from NEM.
  • Added details on finally behaviour

References

  • TIP #89
  • TIP #90
  • Tcl 8.4 catch

Discussion

KD: What I'm missing in this is that Tcl has already a mechanism for throwing exceptions, namely special return codes. For example, the following throws an exception with return code 7:

 return -code 7 "User provided incomplete input"

which can be caught somewhere on the call tree with:

 switch [catch {do something} result] {
     0 {do the next thing}
     1 {return -code 1 -errorcode $::errorCode -errorinfo $::errorInfo $result}
     7 {tk_messageBox -type ok -icon error -message $result}
 }

TIP #89 seems to use ::errorCode for everything, but exceptions are not errors, and currently ::errorCode in Tcl is used only for I/O and OS errors. To me it seems better to keep return code 1 for these errors, and use special return codes for user-defined exceptions.

Twylite: The [try...catch] syntax is not intended to replace [catch], but to simplify the expression of existing exception/error handling techniques, leading to greater code clarity and less error-prone workarounds for [finally] blocks. There is no deficiency in the functionality of Tcl's exception handling mechanisms - what is lacking is a more readable syntax and a standard for behaviour across packages for the common case of catching a subset of errors that are thrown from within a particular block of code.

The Tcl return code is usually used to indicate an exceptional change in program flow - but it does not indicate the cause of that change. The cause is captured in the return options (including error message, errorinfo, errorcode).

My understanding of your suggestion is to use a different return code for each possible "type" of exception. I can't see that providing value for the following reasons:

  • Catch filters will be integers, which are not self documenting. Checking for result "7" is less meaningful than checking for "IOERROR".
  • It is more difficult to filter on a range/class/group of related exceptions when using integers - you could introduce special syntax for "7", or "7,8" or "7-10"; but this is still less powerful and less obvious than "IOERROR *".
  • It is more difficult for different packages to avoid treading on each others' toes when using integers. With string-based exception types it is easy to adopt a convention for unique exception names.
  • Potential to break compatibility with existing extensions: various packages used exceptional return codes for special flow-control purposes.
  • It is hard to use a return-code-based exception mechanism inside control structures. i.e. try { code } catch { any exception but not flow control changes } { ... }. Beyond the Tcl reserved return codes (0-5) you don't know which other codes are intended for use in flow control.

I contend that one return code should be used to indicate an exceptional return condition corresponding to an "exception", and the nature of that exception (used for filtering) should be encoded in the return options.

Which leaves two questions:

  • Which return code should be used? (or: is an exception an error)
  • What option should be used to indicate the nature of the exception?

You assert that "exceptions are not errors", but that is a matter of perception and is often determined by the language. Some distinguish based on compile-time (error) versus runtime (exception) - e.g. PHP, C. To some an error is something necessarily unrecoverable while an exception is just an unexpected transient condition (e.g. Java). To others an error is something local to a routine that prevents it from behaving correctly (like bad arguments) while an exception is an error that originates in subroutines it calls.

The definition of an exception is simply a condition that changes the normal flow of execution, typically indicating that a routine could not execute normally. In systems without SEH this change of flow is normally accomplished with error return codes.

A Tcl error (return -code 1) is exactly that - a change in the normal flow of execution because execution couldn't complete normally. There is also a mechanism to indicate the class/type/nature of the exception: errorCode.

KD: Thanks for the clarifications. In my projects, I've found special return codes the easiest (clearest, most readable) way to distinguish between "real" Tcl errors and user-defined exceptions. But as you say, the reason is that Tcl had no standard syntax for exceptions. With your new syntax there will be no need for special return codes. OK.

I still feel somewhat uneasy about using ::errorCode, mainly because I've found Tcl itself to be not really consistent with it. For example, sometimes the error message itself is repeated in ::errorCode, sometimes not. All too often it is just "NONE", but sometimes undocumented values are returned. (For example, [puts dummy blah] gives errorCode "TCL LOOKUP CHANNEL dummy".)

In any case, errorCode is certainly a list, so string matching should be done on [lindex $::errorCode 0].


NEM:

  • Should try be able to catch non-error exceptions (e.g. break/continue)?
  • Should throw be able to create them?
  • Throw and rethrow need an options dict argument, or take options separately (e.g. -errorinfo).
  • Is type considered to be a string or a list?
  • An exception in a catch handler should abort with error immediately -- don't keep searching through other handlers, as they might throw errors too. I'd say execute only first matching catch handler.
  • Given that type matching is glob-style, the catch handler may also want to capture it so it can determine what exact errorCode matched.
  • Drop the RETHROW errorCode idea - I can't see a good use-case for this.
  • try should return the result of its body. If an error is thrown and caught, then the result of the corresponding catch block should be returned, allowing catch blocks to implement defaults.

Twylite: Quick repeat of what I've said above: The [try...catch] syntax is not intended to replace [catch], but to simplify the expression of existing exception/error handling techniques.

  • Should try catch break, continue, etc? No, IMHO. This allows [try..catch] to be used within control structures like for/foreach/while, and causes it not to interfere with third-party control structures that use return codes. Currently to do a [catch] within a loop requires extra logic to handle the case where returncode is > 1. NEM: Right, but if there is no catch pattern corresponding to a break/continue code then it will be transparent anyway (I assume unmatched exceptions propagate normally). The only issue would be how to specify these types of exceptions. Perhaps it can be added later: catch -code break ... Twylite: Uncaught exceptions and non-error return codes propagate normally. I see your point here - [try...catch] could be more flexible by supporting other codes, but at the same time I don't want to make the syntax so complex that it provides no benefit over [catch].
  • Should throw be able to create them? No - irrelevant if [try] should ignore them.
  • Throw needs dict/options: Why? Throw is meant to be a shortcut for "[error] with errorcode". It should always be possible to construct a more complex exception using [error] - adding options to [throw] seems to merely recreate [error]. NEM: But leaving options out might lead to the command being ignored. It is simple to implement: proc throw {code message args} { return -code error -errorcode $code {*}$args $message } Twylite: I accept that the implementation is easy, but I don't see the use case?
  • Is type a string or a list? Type is intended to be compatible with ::errorCode / -errorcode, which is documented to be a list. That said my intention was to use string glob matching on type. My current feeling is that type should be treated as a string by [try...catch], but a catch block can interpret the type however it wants if it has more specific knowledge.
  • I agree that I prefer to execute only the first matching handler. This is (in my reading) a change in behaviour compared to TIP #89.
  • The catch handler may want to capture the type: Agreed, this does seem sensible. Actually, the type is already available to the handler - capture the options and look at the -errorcode. NEM: OK.
  • Rethrow: the use case is error translation. It is common for a module/API to catch certain types of error and translate them into a different (API-specific) type. For example a file or IO error in a logging subsystem would be caught and re-thrown as a LogException. A rethrow would be equivalent to [return -code 1 -errorcode NEW_ERROR_TYPE -errorinfo "$em\n TRANSLATED FROM\n$::errorInfo" $em]. (Clarification: rethrow can't be implemented by a special errorcode as was suggested in my proposal, as this would make errorcode translation impossible; an implementation of rethrow will probably require a special return code). NEM: Hmm.. rethrow in most try/catch implementations I've seen simply rethrows the original exception as if the catch had never happened. Translation seems like it can be easily done with just plain throw. Twylite: The distinction is the stack trace, really. If you're familiar with Java, it's the difference between catch+throw and catch+fillInStackTrace()+throw.
  • try should return the result of the body (or the executed catch body): you make a good point, especially regarding catch blocks and defaults. NEM: Yes, this is another argument in favour of only executing a single catch block.
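The error-translation use case discussed for rethrow above can be sketched with nothing more than Tcl 8.5's [catch] and [return]. This is an illustration under assumptions, not the proposed rethrow: the procedure log_line, the code {LOGGER WRITE_FAILED} and the file path are all hypothetical.

```tcl
# Hypothetical logging routine that translates any underlying failure into
# a LOGGER-specific error, keeping the original stack trace for context.
proc log_line {path text} {
    if {[catch {
        set f [open $path a]
        puts $f $text
        close $f
    } em opts]} {
        # Original trace, taken from the options dict of the caught error.
        set trace [dict get $opts -errorinfo]
        return -code error \
            -errorcode [list LOGGER WRITE_FAILED] \
            -errorinfo "$em\n    TRANSLATED FROM\n$trace" \
            $em
    }
}

# Writing into a directory that does not exist makes [open] fail.
set code [catch {log_line /nonexistent-dir-for-demo/app.log hello} em opts]
puts "$code | [dict get $opts -errorcode]"
```

The caller sees only the translated errorcode, while the original trace remains readable inside -errorinfo.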

NEM: Finally, my ideal would be that try/catch creates a lambda(s) for catch handlers and these are invoked at the point of error, rather than when the stack has unwound, as this allows for resumes (via break/ continue). This isn't possible to do for errors in general, however (e.g. C code errors).

Twylite: This does sound interesting (although it did rather freak me out when I first encountered it in PHP ;) ) - but it is beyond the scope of what I'm trying to accomplish here, since it does introduce completely new behaviour that is distinct from Tcl's existing mechanisms.


Lars H: One thing you could do is to interpret a continue in some catch body as a signal that "I won't handle this, but keep on looking, since maybe some subsequent catch will." There is a sort-of precedent for this (although with the opposite default) in the use of break within Tk binding scripts to stop more generic bindings from firing.

Would it make sense to make throw more distinct (in terms of return code or return options) from error? (Of course, the -level 2 below is a difference.) It feels like an outright error in a catch body might need slightly different treatment than an error being thrown; in the former case, it is important to learn the exact position in the catch body where this error occurred, but in the latter it is the position within the try body that the errorinfo should focus on.


Implementation

First-pass implementation based on features from TIP #90:

  namespace eval ::control {
    # These are not local, since this allows us to [uplevel] a [catch] rather than
    # [catch] the [uplevel]ing of something, resulting in a cleaner -errorinfo:
    variable em {}
    variable opts {}

  }


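  proc ::control::throw {type message} {
    # -level 2 makes the procedure that called [throw] return the error. A
    # [throw] written directly in a try body therefore reaches [catch] as
    # TCL_RETURN (2), not TCL_ERROR, and is not offered to the catch handlers.
    return -code error -errorcode $type -errorinfo $message -level 2 $message
  }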

  # For future reference: rethrow can be implemented by adding a "-rethrow"
  # key to the return options dict 
  # proc ::control::rethrow {{type {}} {message {}}} {
  #   return -code error -errorcode $type -rethrow 1 $message
  # }


  proc ::control::try {args} {

    # Check parameters
      set try_block [lindex $args 0]
      set has_finally false
      set i 1
      # Check args after try_block - should be zero or more 'catch {spec} body'
      # followed by zero or one 'finally body'
      while { $i < [llength $args] } {
        switch -- [lindex $args $i] {
          "catch" {
            # catch {spec} body
            # spec = {type emvar optsvar} is not checked here, to avoid performance issues
            incr i 3
          }
          "finally" {
            # finally body (and no further handlers)
            set has_finally true
            incr i 2
            break
          }
          default {
            break
          }
        }
      }
      # If we broke out before the last arg then there was a parameter problem
      if { $i != [llength $args] } {
        error "wrong # args: should be \"try body ?catch {type ?emvar? ?optvar?} body? ?...? ?finally body?\""
      }

    # Execute the try_block, catching errors
      variable em
      variable opts
      set code [uplevel 1 [list ::catch $try_block \
        [namespace which -variable em] [namespace which -variable opts] ]]

    # Quickly handle the common case of no errors + no finally
      if { ($code == 0) && ! $has_finally } {
        # Return the result of the try body
        return $em
      }

    # Keep track of the original error message & options
      set _em $em
      set _opts $opts

    # If we got a TCL_ERROR then look for catch blocks (all other return codes
    # propagate to the caller)
      if { $code == 1 } {

        set errorcode [dict get $opts -errorcode]

        # Search catch handlers looking for a match ('type' glob matches errorcode)
        set i 1
        while { [lindex $args $i] eq "catch" } {
          set spec [lindex $args $i+1]
          lassign $spec type emvar optsvar
          incr i 3

          if { ! [string match $type $errorcode] } {
            continue
          }

          # Found a matching catch handler, make error msg/opts available to caller
          if { $emvar ne {} } {
            upvar 1 $emvar _emvar
            set _emvar $em
          }
          if { $optsvar ne {} } {
            upvar 1 $optsvar _optsvar
            set _optsvar $opts
          }

          # Execute catch block
          set catch_block [lindex $args $i-1]
          set code [uplevel 1 [list ::catch $catch_block \
            [namespace which -variable em] [namespace which -variable opts] ]]

          # Handler result replaces the original result (whether success or 
          # failure); capture context of original exception for reference
          dict set opts -during $_opts
          set _em $em
          set _opts $opts

          # Handler has been executed - stop looking for more
          break        
        }

        # No catch handler found -- error falls through to caller
        # OR catch handler executed -- result falls through to caller
      }

    # If we have a finally block then execute it
      if { $has_finally } {
        set finally_block [lindex $args end]
        set code [uplevel 1 [list ::catch $finally_block \
          [namespace which -variable em] [namespace which -variable opts] ]]

        # Finally result takes precedence except on success
        if { $code != 0 } {
          dict set opts -during $_opts
          set _em $em
          set _opts $opts
        }

        # Otherwise our result is not affected
      }

    # Propagate the error or the result of the executed catch body to the caller

      #FIXME -level 2 will hide the try...catch itself from errorInfo, but it
      #  breaks nested 'try { try ... catch } catch' 
      dict incr _opts -level 1

      return -options $_opts $_em
  }


  interp alias {} ::try {} ::control::try
  interp alias {} ::throw {} ::control::throw