[Fabricio Rocha] - 08-Feb-2010 - Error handling always seems to be an underestimated topic in programming, rarely mentioned by or to newbies, even though it is useful and could naturally be taught along with the basics of a language. Only after some two years of studying Tcl/Tk was I able to find some information about this subject and form a very basic and limited idea of how applications can avoid being crashed by bugs or misuse. I would therefore like to discuss some error management techniques with the experienced folks while building up a tutorial from this discussion (something highly useful for aspiring Tclers like me). And, please, treat the errors you find...


**Which error-management features are provided by Tcl?**


***unknown***

Whenever a script invokes a command or procedure that is not defined, the Tcl interpreter calls a command named [unknown], passing it the original command name and arguments. The default implementation tries to find the command elsewhere, for example by auto-loading it from a library; in an interactive session it also tries to run it as an external program or to expand it as an abbreviation. If nothing is found, [unknown] raises an "invalid command name" error, which stops the script and prints an error message unless the error is caught.

Like other Tcl commands, `::unknown` can be renamed and replaced by a procedure of the same name that does other things before falling back to the default actions; this is how `unknown` is best used. Since Tcl 8.5 there is also the [namespace unknown] command, which lets the programmer name a procedure to be called when a command lookup fails in the scope of a specific [namespace].
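
As a minimal sketch of both ideas, assuming Tcl 8.5 or later: the names `::original_unknown`, `::missing_log`, `::demo` and `fallback` are made up for this illustration and are not part of Tcl.

```tcl
# Keep the default handler available under another (hypothetical) name.
rename ::unknown ::original_unknown

# Hypothetical log of every command name that could not be resolved.
set ::missing_log {}

proc ::unknown {args} {
    # Record the failed command name, then defer to the default behaviour.
    lappend ::missing_log [lindex $args 0]
    uplevel 1 [list ::original_unknown {*}$args]
}

# A namespace-specific handler (Tcl 8.5+): it is called only for
# lookups that fail while code is running inside ::demo.
namespace eval ::demo {
    proc fallback {name args} {
        return "demo has no command '$name'"
    }
    namespace unknown ::demo::fallback
}

catch {no_such_command 1 2 3} msg
puts $msg            ;# invalid command name "no_such_command"
puts $::missing_log
puts [namespace eval ::demo {another_missing_command}]
```

The wrapper only adds logging; because it hands the call over to the renamed original, auto-loading keeps working as before.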


***::errorCode***

A reserved global variable called [errorCode] is set by the Tcl interpreter to hold information about errors that occur at run time, so its contents change every time an error happens. `errorCode` is a variable-length list whose first element is a string identifying the type of error; the following elements, if present, are details that an error-handling procedure can use. As of Tcl 8.5, `::errorCode` still seems to be underused by many core Tcl commands. These are the values and structures that core commands generate and store in `::errorCode`, according to the official documentation [http://www.tcl.tk/man/tcl8.5/TclCmd/tclvars.htm]:

   * “ARITH” ''code msg'' – Arithmetic error. The ''code'' element can be DIVZERO, DOMAIN, OVERFLOW, IOVERFLOW or UNKNOWN. ''msg'' contains a human-readable description of the problem.

   * “CHILDKILLED” ''pid sigName msg''
   * “CHILDSUSP” ''pid sigName msg'' – These two codes relate to child processes started by the Tcl interpreter (for example with `exec`); more specifically, they contain information about processes that were unexpectedly killed or suspended by a signal. ''pid'' is the process identifier; ''sigName'' is the signal that caused the process to end or be suspended; ''msg'' is a human-readable explanation of the problem. The list of possible values for ''sigName'' is in the system's C standard library ''signal.h'' header file (''TODO: list them here'').

   * “CHILDSTATUS” ''pid code'' – These values are set when an external program run by a Tcl script exits with a non-zero status, which is considered an abnormal end. In such cases, the second element of `::errorCode` contains the process identifier and the third one holds the process "exit code". Note that some system utilities intended for use in pipelines exit with non-zero values as a normal result of their operation, so the ''code'' value may be the real and valid result of the child process.

   * “POSIX” ''errName msg'' – Many commands that depend on OS-provided functionality, like file and [socket] operations, can produce errors of this family. The possible values for the ''errName'' item are listed in the ''errno.h'' header file of the C standard library (TODO: list them here). There is some debate about the precision of these error reports, mainly under Windows, which is not exactly POSIX-compliant.

   * “NONE” – This single value in a one-element `::errorCode` is set when a procedure generates an error, intentionally or not, but gives no detailed information about it.
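
A small sketch of how some of these values can be observed from a script. The file path is a made-up one that is assumed not to exist, and the exact POSIX error name may vary between systems.

```tcl
# Division by zero sets an ARITH error code.
catch {expr {1 / 0}} msg
set arith $::errorCode
puts $arith                 ;# ARITH DIVZERO {divide by zero}

# Opening a missing file sets a POSIX error code.
# (Hypothetical path, assumed not to exist.)
catch {open /no/such/dir/no-such-file r} msg
set posix $::errorCode
puts [lindex $posix 0]      ;# POSIX
puts [lindex $posix 1]      ;# the errno name for a missing file

# An error raised without an explicit code leaves NONE.
catch {error "something went wrong"} msg
set none $::errorCode
puts $none                  ;# NONE
```

Copying `::errorCode` into a local variable right after the `catch` is a cautious habit, since the next error anywhere in the program overwrites it.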


Any procedure can set its own error values in `::errorCode` by using the "advanced" options for the command `return`, as we will see below.


***return***

The [return] command has more uses than just giving a value back to the caller of the procedure it is in. As many Tcl tutorials and books advise, it is good practice to end a procedure with an explicit `return`, even if no value is actually returned (as in Pascal procedures or C "void functions"). If `return` is omitted, the procedure returns the result of the last command it executed. A bare `return` gives back an empty string. In either case, when nothing went wrong, the return ''code'' (not the returned value) is 0, meaning "ok".

For some procedures, returning 1 or 0, "yes" or "no" is enough to tell their callers whether something was done. But quite often a procedure can return any of three or more values, and the value alone cannot tell the caller that it is actually the result of an error. Fortunately, `return` accepts special options that tell the interpreter and the whole application that something went wrong and that the returned value should not be taken as valid. The values passed with these options can be placed in the `::errorCode` variable, and they normally make the interpreter do something other than simply continuing with the script; the most common action is to halt processing and print error messages in the console. Here are these options:
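
To make the difference between the returned value and the return code concrete, here is a short sketch; the procedure names are invented for the example.

```tcl
# No explicit return: the result is that of the last command executed.
proc area_implicit {w h} {
    expr {$w * $h}
}

# Explicit return: same result, but the intent is clearer.
proc area_explicit {w h} {
    return [expr {$w * $h}]
}

# A bare return gives back the empty string.
proc log_only {text} {
    puts "log: $text"
    return
}

puts [area_implicit 3 4]                  ;# 12
puts [area_explicit 3 4]                  ;# 12
puts [string length [log_only "hello"]]   ;# prints "log: hello", then 0
# catch reports the return code, which is 0 for a normal return.
puts [catch {area_explicit 3 4}]          ;# 0
```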

'''-code''' ''code'' - This option sets the return code that the caller sees, which lets the programmer deliberately end the procedure with an error or another exceptional condition. It accepts the following values:

   * '''0 or "ok"''' - This is the default value assumed when the option is not present. It means that the value passed with the `return` command is valid and evaluation of the script can continue normally.

   * '''1 or "error"''' - Indicates that an error happened and the returned value cannot be considered valid; in fact, it should be treated as a string that explains the error. This code raises an error in the interpreter just like the [error] command, and if this error state is not handled properly, the interpreter stops evaluating the script and shows error messages in the console.

   * '''2 or "return"''' - Makes the calling procedure itself return, as if the caller had executed `return` with the given value; the effect applies one level up in the procedure call stack. (''Can someone provide some examples of situations where this is useful?'')

   * '''3 or "break"''' - Mainly used by commands that implement loops; has the effect of a [break] command issued in the caller's context.

   * '''4 or "continue"''' - Also used basically by looping commands; has the same effect as a [continue] command in the calling context.
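
One possible use of the "return" and "continue" codes is in small helper procedures that steer their caller. The following is only a sketch; `return_default_if_empty`, `greeting` and `skip_if_odd` are hypothetical names.

```tcl
# Hypothetical guard: if the value is empty, make the *calling*
# procedure return immediately with a default result.
proc return_default_if_empty {value default} {
    if {$value eq ""} {
        return -code return $default
    }
    return $value
}

proc greeting {name} {
    set name [return_default_if_empty $name "Hello, stranger"]
    return "Hello, $name"
}

# Hypothetical helper that behaves like [continue] in the caller's loop.
proc skip_if_odd {n} {
    if {$n % 2} {
        return -code continue
    }
}

set evens {}
foreach n {1 2 3 4 5 6} {
    skip_if_odd $n
    lappend evens $n
}

puts [greeting ""]      ;# Hello, stranger
puts [greeting "Ana"]   ;# Hello, Ana
puts $evens             ;# 2 4 6
```

Helpers like these keep repeated "check and bail out" code in one place, at the price of hiding a control-flow jump inside a procedure call.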
   
'''-errorcode''' ''list'' - If the `-code` option was set to 1, the `::errorCode` variable is, by default, set to NONE. This option allows the programmer to set `::errorCode` to the format and values of ''list'', thus providing more details about the error. According to the official documentation, this option is ignored if the `-code` option is set to any value other than 1 or "error".

'''-errorinfo''' ''string'' - This option is also valid only if the `-code` option was set to 1 or "error". When it is omitted, a stack trace listing the latest procedure calls that happened before the error is stored in the `::errorInfo` global variable. With this option, more detailed information can be included in `::errorInfo`, for example what the procedure was trying to do at the moment of the error. If the stack trace is still of interest, the programmer can retrieve the contents of the `::errorInfo` variable, append them to a customized message, then pass the whole thing to the `-errorinfo` option of `return`.

'''-options''' ''list_of_pairs'' - ''list_of_pairs'' is a dictionary of return options and values, which are treated as if they had been given directly to `return`. This is useful for passing on to the caller all the error information that the procedure itself received from a command it called, typically the options dictionary captured by `catch`.

'''-level''' ''number'' - ''number'' is the number of levels up in the procedure call stack at which the return code defined by the `-code` option will be applied. For example, if procedure A calls B, B calls C and C ends with something like `return -code error -level 2 "Houston, we have a problem"`, the error won't be raised in the context of B (which would be the default behaviour), but instead in the context of A. (''A good use for this, anyone?'')
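
The following sketch, assuming Tcl 8.5 or later, combines `-code`, `-errorcode` and `-level`. The procedure `parse_port` and the `MYAPP ...` error code layout are invented for the illustration; A, B and C are the procedures from the example above.

```tcl
# Hypothetical procedure that reports failures with a structured error code.
proc parse_port {text} {
    if {![string is integer -strict $text]} {
        return -code error \
            -errorcode [list MYAPP BADPORT NOTNUMBER $text] \
            "port must be an integer, got \"$text\""
    }
    if {$text < 1 || $text > 65535} {
        return -code error \
            -errorcode [list MYAPP BADPORT RANGE $text] \
            "port $text is outside 1..65535"
    }
    return $text
}

# -level 2: the error surfaces in the caller of the caller.
proc C {} { return -code error -level 2 "Houston, we have a problem" }
proc B {} { C; return "B finished normally" }
proc A {} {
    if {[catch {B} msg]} {
        return "A caught: $msg"
    }
    return "A saw no error: $msg"
}

catch {parse_port "http"} msg
puts $msg            ;# port must be an integer, got "http"
puts $::errorCode    ;# MYAPP BADPORT NOTNUMBER http
puts [A]             ;# A caught: Houston, we have a problem
```

Note that B never reaches its own `return`: the call to C makes B itself end with an error, which is what A then catches.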


***catch***

The [catch] command is the Tcl way of directly receiving the special values that [return] may have set in the case of an error. It evaluates a script in the current interpreter and intercepts any error or other exceptional return code, so that a failure does not abort the application. `catch` returns the return code of the script: 0 if it ended normally, or the non-zero `-code` (1 for an error) if it ended abnormally. Two variable names can optionally be passed to `catch`. The first receives the value passed by the called procedure's `return` command (which, in case of an error, is expected to be an explanation of the error). The second, available since Tcl 8.5, receives a [dict] of return options: it always contains `-code` and `-level`, and in case of an error also `-errorcode`, `-errorinfo` and `-errorline`, mirroring the contents of `::errorCode` and `::errorInfo`.
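
A short sketch of `catch` with both optional variables, assuming Tcl 8.5 or later; the procedure `risky` and its `MYAPP NEGATIVE` code are hypothetical.

```tcl
proc risky {n} {
    if {$n < 0} {
        return -code error -errorcode {MYAPP NEGATIVE} "negative input: $n"
    }
    return [expr {sqrt($n)}]
}

# Success: catch returns 0 and the result variable holds the value.
set code [catch {risky 16} result options]
puts "$code $result"                       ;# 0 4.0

# Failure: catch returns 1 and the result variable holds the message.
set code [catch {risky -1} result options]
puts "$code $result"                       ;# 1 negative input: -1
puts [dict get $options -code]             ;# 1
puts [dict get $options -errorcode]        ;# MYAPP NEGATIVE
puts [dict exists $options -errorinfo]     ;# 1

# Other exceptional codes are caught too: break gives 3.
puts [catch {break}]                       ;# 3
```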

***error***
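The [error] command raises an error, which triggers the error-handling measures in the application and/or the Tcl interpreter. Its syntax is `error message ?info? ?code?`: ''message'' is the error text, and the optional ''info'' and ''code'' arguments initialize `::errorInfo` and `::errorCode`. Before Tcl 8.5 introduced some of the advanced options for the [return] command, `error` was the preferred way to intentionally signal an error.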
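
A minimal sketch of `error` with an explicit error code; the `withdraw` procedure and the `BANK ...` code are invented for the example, and the options variable of `catch` assumes Tcl 8.5 or later.

```tcl
# error message ?info? ?code?
proc withdraw {balance amount} {
    if {$amount > $balance} {
        # Empty info argument: let Tcl build the stack trace itself.
        error "insufficient funds" "" [list BANK INSUFFICIENT $balance $amount]
    }
    return [expr {$balance - $amount}]
}

set code [catch {withdraw 100 250} msg opts]
puts $code                          ;# 1
puts $msg                           ;# insufficient funds
puts [dict get $opts -errorcode]    ;# BANK INSUFFICIENT 100 250
puts [withdraw 100 30]              ;# 70
```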


***bgerror***
(''I can't really understand the use of bgerror. Could someone please explain what it is for and how it can be used?'')

**How to use all this stuff?**
The infrastructure provided by Tcl allows applications to use [exception handling], in the traditional sense of "try to do this, and if something goes wrong tell me and I'll see what I can do". This contrasts with the approach of "error prediction", which, for example, performs a series of validity tests on the data that will be passed to a command before the operation is performed. The two techniques are not mutually exclusive, however. Tcl allows various approaches to error management, each with its pros and cons:

*** Approach 1: return, catch and process the error ***
1) Always use the advanced `return` options when writing procedures which can cause or face errors, or which may give back an invalid result;
2) Always use `catch` when calling commands or your own procedures which can cause or face errors as described in 1;
3) Create a procedure to be called whenever `catch` captures an error, which interprets the error codes and, based on them, shows error messages in friendly and standardized dialogs and performs operations that could minimize or solve the error.
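
The three steps above might be sketched as follows, assuming Tcl 8.5 or later. The procedures `load_config` and `report_error`, the `APP NOFILE` code and the file path are all hypothetical; a real application would choose its own names and code layout.

```tcl
# Step 1: a procedure that signals failure with a structured error code.
proc load_config {path} {
    if {![file exists $path]} {
        return -code error -errorcode [list APP NOFILE $path] \
            "configuration file not found"
    }
    set f [open $path r]
    set data [read $f]
    close $f
    return $data
}

# Step 3: one central place that turns error codes into user messages.
proc report_error {message options} {
    set ec [dict get $options -errorcode]
    switch -- [lindex $ec 0] {
        APP     { return "Application problem ([lindex $ec 1]): $message" }
        POSIX   { return "System problem ([lindex $ec 1]): $message" }
        default { return "Unexpected error: $message" }
    }
}

# Step 2: call through catch and hand failures to the reporter.
# (Hypothetical path, assumed not to exist.)
if {[catch {load_config /no/such/dir/app.conf} result options]} {
    set text [report_error $result $options]
} else {
    set text "Loaded [string length $result] characters"
}
puts $text   ;# Application problem (NOFILE): configuration file not found
```

Here the reporter only builds a string; in a Tk application the same text could be shown with `tk_messageBox` instead of `puts`.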

*** Approach 2: tracing ::errorCode***
Create a [trace] on `::errorCode`, and a procedure to be called every time it is modified, which interprets the codes, displays them, applies mitigation measures, etc.


''Any other? Please add what you do!''


***Which errors shall be told to the user?***


***Which errors shall NOT be told to the user?***

