Fabricio Rocha - 08-Feb-2010 - Error treatment in programming always seems to be an underestimated topic, rarely explained by or to newbies, although it is a useful thing that could naturally be taught along with the basics of a programming language. Only after some two years of studying Tcl/Tk was I able to find some information about this subject and develop a very basic and limited idea of how applications can avoid being crashed by bugs or misuse. So I would like to discuss some error management techniques with the experienced folks, while building up a tutorial from this discussion (something highly useful for aspiring Tclers like me). And, please, treat the errors you find...
Whenever a script invokes a command/procedure which is not defined anywhere, the Tcl interpreter invokes a command called unknown. This command searches for a definition of the procedure in places other than the interpreter's current context (for example, by trying to auto-load it from a library), and if no procedure with that name is found anywhere, unknown raises an error, which by default stops the script's processing and shows an error message in the console.
Like other Tcl commands, ::unknown can be renamed and replaced by a procedure with the same name which does other things before falling back to the default actions; this is the way unknown is best used. Since Tcl 8.5, there is also the namespace unknown command, which allows the programmer to name a procedure which will be called when a command/procedure lookup fails in the scope of a specific namespace.
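As a minimal sketch of the namespace unknown mechanism, assuming Tcl 8.5 or later: the namespace ::app, the handler handleUnknown, the missing command frobnicate and the error code APP NOCOMMAND are all hypothetical names chosen for illustration.

```tcl
namespace eval ::app {
    # Hypothetical handler: Tcl appends the unresolved command name and
    # its arguments to the handler when the lookup fails inside ::app.
    proc handleUnknown {name args} {
        return -code error -errorcode [list APP NOCOMMAND $name] \
            "app: no such command \"$name\""
    }
    # Register the handler for this namespace only.
    namespace unknown ::app::handleUnknown
}

# "frobnicate" is not defined anywhere, so the handler is called.
set status [catch {namespace eval ::app {frobnicate 1 2}} msg]
puts "$status: $msg"
```

Commands that fail to resolve outside ::app still go through the global ::unknown, so the handler only affects code running in that namespace.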
A reserved global variable called errorCode is maintained by the Tcl interpreter during the execution of a script to hold information about errors that occur at run time, so its contents change every time an error happens. errorCode is a variable-length list whose first element is a string indicating the type of error, and whose following elements, if any, are details about the error which can be used by a procedure for error treatment. As of Tcl 8.5, ::errorCode still seems to be underused by many of the core Tcl commands. The formats those commands generate and store in ::errorCode are described in the official documentation [L1]; their first elements are ARITH, CHILDKILLED, CHILDSTATUS, CHILDSUSP, NONE and POSIX.
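A few core commands can be used to see what ::errorCode looks like in practice. This is only a sketch: the file path is a hypothetical one assumed not to exist, and the POSIX details depend on the operating system.

```tcl
# Division by zero: an arithmetic error reported by expr.
catch {expr {1 / 0}}
set arithCode $::errorCode
puts $arithCode                  ;# ARITH DIVZERO {divide by zero}

# Opening a file that (we assume) does not exist: an operating-system error.
catch {open /no/such/dir/no-such-file r}
set posixCode $::errorCode
puts [lindex $posixCode 0]       ;# POSIX

# An error raised without an explicit code.
catch {error "something broke"}
set plainCode $::errorCode
puts $plainCode                  ;# NONE
```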
Any procedure can set its own error values in ::errorCode by using the "advanced" options for the command return, as we will see below.
The return command has more uses than just giving back a valid or invalid value to the script or procedure which called the procedure it is in. As often told in Tcl tutorials and books, using a return command at the end of a procedure is always recommended even if no value is actually returned (i.e., like Pascal procedures or C "void functions"); if the return command is omitted, the procedure returns the result of its last command when it ends. If everything went right, the return code (not the returned value) will be 0, meaning "ok", by default.
For some procedures, returning 1 or 0, "yes" or "no" will be sufficient for telling their callers that something was done or not. But quite often a procedure must return one of three or more values, and there is no way to tell the caller that a certain returned value is actually the result of an error. Fortunately, Tcl allows the programmer to use special options that tell the interpreter and the whole application that something went wrong and the returned value should not be taken as valid. The values passed with these options can be placed in the ::errorCode variable, and they normally tell the interpreter to do something other than just going on with the script processing; the most common action is to halt processing and issue error messages in the console. Here are these options:
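The difference between a procedure's result and its return code can be seen with a short sketch; addImplicit and addExplicit are hypothetical names used only for this illustration.

```tcl
# No explicit return: the result is whatever the last command produced.
proc addImplicit {a b} {
    expr {$a + $b}
}

# Explicit return: same result, stated openly.
proc addExplicit {a b} {
    return [expr {$a + $b}]
}

# catch gives back the return code; the result lands in the variable.
set code [catch {addImplicit 2 3} value]
puts "code=$code value=$value"    ;# code=0 value=5
puts [addExplicit 2 3]            ;# 5
```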
-code code - This option allows the programmer to deliberately interrupt the procedure and raise an error flag. It can receive the following values: ok (or 0), error (1), return (2), break (3), continue (4), or any other integer, which is treated as an application-defined code.
-errorcode list - If the -code option is set to error (1), the ::errorCode variable is, by default, set to NONE. This option allows the programmer to set ::errorCode to the format and values of list, thus providing more details about the error. According to the official documentation, this option is ignored if the -code option is set to any value other than 1 or "error".
-errorinfo string - This option is also valid only if the -code option is set to 1 or "error". When it is omitted, a stack trace listing the latest procedure calls that happened before the error is stored in the ::errorInfo global variable. With this option, more detailed information can be included in ::errorInfo, for example what the procedure was trying to do at the moment of the crash. If the stack trace is still of interest, the programmer can retrieve the contents of the ::errorInfo variable, append them to a customized message, then put the whole thing in the -errorinfo option of return.
-options list_of_pairs - These pairs of options/values, with any contents, are simply appended to the other pairs which are given back by return to the procedure caller. This is useful for passing to the caller all the error information which the procedure itself received from a command it had used.
-level number - number is the number of levels up in the procedure call stack at which the return code defined by the -code option will be applied. For example, if procedure A calls B, B calls C and C ends with something like return -code error -level 2 "Houston, we have a problem", the error won't be raised in the context of B (which would be the default behaviour); instead B itself returns at once with that error, which is raised in the context of A. (A good use for this, anyone?)
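The options above can be sketched as follows, assuming Tcl 8.5 or later. The procedure ageOf, the error code APP NOUSER and the sample data are hypothetical; the procedures A, B and C reproduce the -level example from the text.

```tcl
# Hypothetical procedure: looks up a user's age in a dict.
proc ageOf {users name} {
    if {![dict exists $users $name]} {
        # Signal an error and describe it in a machine-readable way.
        return -code error \
            -errorcode [list APP NOUSER $name] \
            "unknown user \"$name\""
    }
    return [dict get $users $name]
}

set users [dict create alice 30 bob 25]
set status [catch {ageOf $users carol} msg opts]
puts $status                       ;# 1
puts $msg                          ;# unknown user "carol"
puts [dict get $opts -errorcode]   ;# APP NOUSER carol

# -level 2: C makes its caller B return with the error, so A sees it.
proc C {} { return -code error -level 2 "Houston, we have a problem" }
proc B {} { C; return "B finished" }
proc A {} {
    set status [catch {B} msg]
    return [list $status $msg]
}
set levelResult [A]
puts $levelResult                  ;# 1 {Houston, we have a problem}
```

Note that B never reaches its own return command: the call to C ends B immediately.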
The catch command is the Tcl way of directly receiving the special values that return may have set in the case of an error. It evaluates a script in the current interpreter and intercepts any error or other exceptional condition, so a failure inside the script does not abort the application. catch returns the return code of the script: 0 if it completed normally, and otherwise exactly the code with which it ended (1 for an error). Two variable names can optionally be passed to catch: the first receives the value passed by the called procedure's return command (which, in case of an error, is expected to be an explanation of the error), and the second, available since Tcl 8.5, holds a dict of return options. The dict always contains -code and -level and, for an error, also -errorcode, -errorinfo and -errorline, so it mirrors the contents of ::errorCode and ::errorInfo and the extra options used in the return command.
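A short sketch of what catch gives back in different situations, assuming Tcl 8.5 or later for the options dict:

```tcl
# Normal completion: code 0, result in the first variable.
set okCode [catch {string length "hello"} okResult]
puts "$okCode $okResult"                  ;# 0 5

# An error: code 1, message in the first variable, details in the dict.
set errCode [catch {error "oops"} errResult options]
puts "$errCode $errResult"                ;# 1 oops
puts [dict get $options -code]            ;# 1
puts [dict get $options -errorcode]       ;# NONE
puts [dict exists $options -errorinfo]    ;# 1

# catch also reports the other exceptional codes.
set returnCode [catch {return "early"}]
set breakCode [catch {break}]
set continueCode [catch {continue}]
puts "$returnCode $breakCode $continueCode"   ;# 2 3 4
```

Because catch reports codes other than 1 as well, testing its result with a plain "if" treats break, continue and return as failures; comparing the code with 1 is safer when only real errors matter.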
The error command raises an error: it takes a message and, optionally, initial values for ::errorInfo and ::errorCode, and it makes the enclosing script fail with return code 1 unless something catches it. Before Tcl 8.5 introduced some of the advanced options for the return command, error was the preferred way to intentionally signal an error.
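A minimal sketch of the error command with all three of its arguments; sqrtChecked and the code APP DOMAIN are hypothetical names for illustration.

```tcl
# Hypothetical procedure that refuses negative input.
proc sqrtChecked {x} {
    if {$x < 0} {
        # error message ?info? ?code?
        # An empty info argument lets Tcl build ::errorInfo as usual.
        error "negative input: $x" "" [list APP DOMAIN $x]
    }
    return [expr {sqrt($x)}]
}

set status [catch {sqrtChecked -4} msg]
puts "$status $msg"        ;# 1 negative input: -4
puts $::errorCode          ;# APP DOMAIN -4
puts [sqrtChecked 9]       ;# 3.0
```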
The bgerror command is called when an uncaught error reaches the Tcl/Tk event loop; it gives the application the ability to handle the error in some appropriate way. In GUI applications, it's common to report the error to the user and give them the ability to easily send the stack trace and any related information to the developer. In non-GUI applications, it's useful to log the stack trace and related information to a log file; then, the application can either keep running or shut down, as appropriate.
The infrastructure provided by Tcl allows applications to use exception handling, in the traditional sense of "try to do this, and if something goes wrong tell me and I'll see what I can do". This contrasts with the approach of "error prediction", which, for example, performs a series of tests on the data which will be passed to a command to check its validity before the operation is performed. The two techniques are not mutually exclusive, however. Tcl allows various approaches to error management, each with its pros and cons:
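A minimal sketch of a bgerror procedure for a non-GUI script. The variables ::lastBgError and ::done are hypothetical; a real application would more likely write to a log file or open a dialog instead of storing the message.

```tcl
# Called by Tcl with the error message when an error escapes an event handler.
proc bgerror {message} {
    # ::errorInfo holds the stack trace of the failed handler at this point.
    set ::lastBgError [list $message $::errorInfo]
    set ::done 1
}

# Schedule a callback that fails, then enter the event loop until
# bgerror has reported the failure.
after 0 {error "timer callback failed"}
vwait ::done
puts [lindex $::lastBgError 0]    ;# timer callback failed
```

Without the event loop (here entered through vwait) neither the callback nor bgerror would run.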
1) Always use the advanced return options when writing procedures which can cause or face errors, or which may give back an invalid result;
2) Always use catch for calling commands or your own procedures which can cause or face errors as described in 1;
3) Create a procedure to be called whenever catch captures an error, which interprets the error codes and, based on them, shows error messages in friendly and standardized dialogs and performs operations which could minimize or solve the error.
Create a trace on ::errorCode, and a procedure to be called every time it is modified, for interpreting the codes, displaying them, providing minimization measures, etc.
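Steps 1 to 3 can be sketched as below, assuming Tcl 8.5 or later. The names handleError and safely, the APP error class and the message texts are hypothetical; a GUI application would show a dialog where this sketch just returns a string.

```tcl
# Hypothetical central handler: turns an error code into a friendly message.
proc handleError {message options} {
    set code [dict get $options -errorcode]
    switch -- [lindex $code 0] {
        POSIX   { return "System problem: [lindex $code 2]" }
        ARITH   { return "Calculation problem: $message" }
        APP     { return "Application problem: $message" }
        default { return "Unexpected problem: $message" }
    }
}

# Hypothetical wrapper: runs a script in the caller's context and hands
# real errors (code 1) to the central handler.
proc safely {script} {
    set status [catch {uplevel 1 $script} result options]
    if {$status == 1} {
        return [handleError $result $options]
    }
    # Codes other than 1 are not errors; just pass the result on.
    return $result
}

puts [safely {expr {1 / 0}}]            ;# Calculation problem: divide by zero
puts [safely {error "no config" "" {APP CONFIG}}]
puts [safely {string length "abc"}]     ;# 3
```

Keeping the interpretation of codes in one procedure means every caller reports failures in the same style.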
Any other? Please add what you do!
LV One useful thing that I sometimes use is creation of log files containing information intended to be useful in determining the state of the program during particular points. Sometimes, displaying information about the values of a number of variables is not as helpful as having that information written to a file - for instance, there are times when a GUI application might not have easy access to stderr for error traces. Writing information to a log file, which is available - and perhaps even emailable - to the programmer responsible is helpful.
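A rough sketch of such a log file, assuming Tcl 8.5 or later. The procedure logError and the file name demo-errors.log are hypothetical, and the sketch assumes the current directory is writable.

```tcl
# Hypothetical helper: appends a timestamped message and stack trace to a file.
proc logError {logFile message options} {
    set chan [open $logFile a]
    set stamp [clock format [clock seconds] -format "%Y-%m-%d %H:%M:%S"]
    puts $chan "$stamp ERROR: $message"
    puts $chan [dict get $options -errorinfo]
    close $chan
}

set logFile [file join [pwd] demo-errors.log]
if {[catch {expr {1 / 0}} message options] == 1} {
    logError $logFile $message $options
}
```

The file is opened in append mode, so earlier entries survive across runs and can be sent to the programmer later.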
Failures in file, channel and socket operations?
Errors caused by invalid inputs. It is often useful to use a distinct error code (e.g., INVALID) for data validation errors, as it makes it possible for the application to distinguish between errors in the user's input and errors in the validation or execution code.
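A sketch of a distinct INVALID code for validation errors, assuming Tcl 8.5 or later; parsePort, tryPort and the message texts are hypothetical.

```tcl
# Hypothetical validator: rejects anything that is not a port number.
proc parsePort {text} {
    if {![string is integer -strict $text] || $text < 1 || $text > 65535} {
        return -code error -errorcode [list INVALID PORT $text] \
            "\"$text\" is not a valid port number"
    }
    return $text
}

# Hypothetical caller: treats INVALID as the user's mistake and
# passes every other error on unchanged.
proc tryPort {text} {
    if {[catch {parsePort $text} result options] == 1} {
        if {[lindex [dict get $options -errorcode] 0] eq "INVALID"} {
            return "please correct your input: $result"
        }
        # Anything else is a bug in our own code: re-raise it.
        return -options $options $result
    }
    return "using port $result"
}

puts [tryPort 8080]    ;# using port 8080
puts [tryPort abc]     ;# please correct your input: "abc" is not a valid port number
```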
Syntax errors and programming bugs - They'd better be fixed. Sure, but....
LV Certainly they need to be fixed. However, if you hide the info from the user, how will the programmer know what the bug/error is? Unless you have a guaranteed method of getting said info to the programmer (and email doesn't count - the user MIGHT be working offline), then providing the user with sufficient information to a) know what the error is and b) know who to contact or what to do about the problem seems the best approach to me.
Fabricio Rocha - 12-Feb-2010 - One more reason for having a way to intercept and explain this kind of error to common users is that no test suite or test routine seems able to find some of the errors that users are able to find. Of course it is not nice to show weaknesses to a final user, but this is something practically unavoidable in software. And in addition to the situations listed by LV, we can consider that, for open source/free software, providing good information about an error is a way to c) allow a user with sufficient programming knowledge to fix the problem and possibly contribute to the software development.