Version 4 of Tcl performance: catch vs. info

Updated 2011-03-28 09:18:45 by dkf

Bastien Chevreux wrote in the comp.lang.tcl newsgroup: One thing I discovered is that 'catch' comes with a tremendous cost (of time) if it is triggered. I've always been the lazy kind of programmer, so I frequently used this kind of code:

# initialise 'b' so that we always have valid values (and we don't
# forget it later on)
set b "default"

# do something, eventually creating the variable 'a'
# now, set b to a if it exists
catch { set b $a}

Well ... don't do it (or just use it when you are absolutely sure that it'll trigger only in very rare exceptions). I've attached a small test script which shows that this is approximately 10 times slower than:

if {[info exists a]} {set b $a}

when a does not exist. When the catch does not trigger, however, it is approximately 2 times faster than the if-variant.

#!/bin/sh
# \
        exec tclsh "$0" ${1+"$@"}
 
proc doinfo_fail {} {
    set b "default"
    for {set i 0} {$i < 1000000} {incr i} {
        if {[info exists a]} {set b $a}
    }
    return $b
}
proc doinfo_ok_local {} {
    set a "xxxxxxx"
    set b "default"
    for {set i 0} {$i < 1000000} {incr i} {
        if {[info exists a]} {set b $a}
    }
    return $b
}
proc doinfo_ok_global {} {
    global a
    set b "default"
    
    for {set i 0} {$i < 1000000} {incr i} {
        if {[info exists a]} {set b $a}
    }
    return $b
}
proc docatch_fail {} {
    set b "default"
    for {set i 0} {$i < 1000000} {incr i} {
        catch {set b $a}
    }
    return $b
}
proc docatch_ok_local {} {
    set b "default"
    set a "xxxxxxx"
    for {set i 0} {$i < 1000000} {incr i} {
        catch {set b $a}
    }
    return $b
}
proc docatch_ok_global {} {
    global a
    set b "default"
    for {set i 0} {$i < 1000000} {incr i} {
        catch {set b $a}
    }
    return $b
}
set a "xxxxxxx"

puts "doinfo_fail         [time {doinfo_fail} 1]"
puts "doinfo_ok_local     [time {doinfo_ok_local} 1]"
puts "doinfo_ok_global    [time {doinfo_ok_global} 1]"
puts "docatch_fail        [time {docatch_fail} 1]"
puts "docatch_ok_local    [time {docatch_ok_local} 1]"
puts "docatch_ok_global   [time {docatch_ok_global} 1]"

exit 0

Donal Fellows replied: My performance figures for your script (using the current CVS HEAD version of Tcl under Solaris on an Ultra-5) can be seen below:

  doinfo_fail          7957618 microseconds per iteration
  doinfo_ok_local      8789728 microseconds per iteration
  doinfo_ok_global    11462929 microseconds per iteration
  docatch_fail        67244385 microseconds per iteration
  docatch_ok_local     3634439 microseconds per iteration
  docatch_ok_global    5126791 microseconds per iteration

These figures show that in performance-sensitive code it is important to code for the "normal" case. Where you expect the code to succeed normally and fail only occasionally, there is quite a gain to be had from using catch (here roughly a factor of two), but you take a definite performance hit (roughly a factor of eight) whenever the failure case is hit.
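As an illustration of coding for the normal case, the two idioms can be wrapped as a pair of helpers. This is only a sketch: the names getOptimistic and getCautious are hypothetical and do not come from the thread, and the extra procedure call adds overhead of its own, so the timings above do not carry over directly.

```tcl
# Hypothetical helpers, not part of the original thread.

# For variables that almost always exist: just read, fall back on error.
proc getOptimistic {varName default} {
    upvar 1 $varName v
    if {[catch {set v} value]} {
        return $default
    }
    return $value
}

# For variables that are often missing: test first, then read.
proc getCautious {varName default} {
    upvar 1 $varName v
    if {[info exists v]} {
        return $v
    }
    return $default
}

set a "xxxxxxx"
puts [getOptimistic a "default"]       ;# a exists, the catch does not trigger
puts [getCautious nosuch "default"]    ;# nosuch is missing, no error is raised
```

Both helpers return the same values for the same input; they differ only in which case is cheap.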


1 March 2001: Kevin Kenny adds:

Let's try to quantify the effect a bit more. Let's take one of the procedures from Counting Elements in a List, and code it up with both info exists and catch.

proc count1 { list countArray } {
     upvar 1 $countArray count
     foreach item $list {
         if { [catch { incr count($item) }] } {
             set count($item) 1
         }
     }
     return
}
 
proc count2 { list countArray } {
     upvar 1 $countArray count
     foreach item $list {
         if { [info exists count($item)] } {
             incr count($item)
         } else {
             set count($item) 1
         }
     }
     return
}
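A note for readers on newer interpreters: from Tcl 8.5 onwards, incr treats a missing variable as 0 and creates it. The catch in count1 therefore no longer fires on the first occurrence of an item, and neither test is needed. The sketch below shows this; count3 is a hypothetical name, not one of the procedures being compared here.

```tcl
package require Tcl 8.5-

# Hypothetical third variant: relies on incr creating missing
# array elements, which only works on Tcl 8.5 or later.
proc count3 { list countArray } {
    upvar 1 $countArray count
    foreach item $list {
        incr count($item)
    }
    return
}

count3 {apple pear apple} total
foreach key [lsort [array names total]] {
    puts "$key $total($key)"
}
```

On such interpreters the comparison of catch against info exists is better made with an operation that still fails on a missing variable, such as the plain read used in the first script.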

Now we can wrap a little benchmark program around the two procedures:

puts {  List  |          | [info }
puts { Length | [catch]  |  exists]}
puts { -------+----------+----------}
 
set list [list apple]
set i 1
 
while { $i <= 20 } {
     set n [expr { 20000 / $i }]

     set c1 [clock clicks -milliseconds]
     for { set j 0 } { $j < $n } { incr j } {
         count1 $list total
         unset total
     }
     set t1 [expr { [clock clicks -milliseconds] - $c1 }]
     set t1 [expr { $t1 / double($n) }]

     set c2 [clock clicks -milliseconds]
     for { set j 0 } { $j < $n } { incr j } {
         count2 $list total
         unset total
     }
     set t2 [expr { [clock clicks -milliseconds] - $c2 }]
     set t2 [expr { $t2 / double($n) }]
 
     puts [format " %6d | %8.3f | %8.3f " $i $t1 $t2]
     flush stdout
 
     incr i 1
     lappend list apple
}

This program tells us the time taken by both methods for various list lengths. The following numbers (milliseconds per call) are from a 550 MHz PIII running Tcl 8.3.2:

  List  |          | [info 
 Length | [catch]  |  exists]
 -------+----------+----------
      1 |    0.077 |    0.045 
      2 |    0.088 |    0.059 
      3 |    0.095 |    0.071 
      4 |    0.104 |    0.082 
      5 |    0.110 |    0.095 
      6 |    0.120 |    0.105 
      7 |    0.126 |    0.119 
      8 |    0.136 |    0.128 
      9 |    0.135 |    0.144 
     10 |    0.145 |    0.150 
     11 |    0.155 |    0.165 
     12 |    0.157 |    0.174 
     13 |    0.176 |    0.183 
     14 |    0.175 |    0.211 
     15 |    0.180 |    0.210 
     16 |    0.193 |    0.224 
     17 |    0.196 |    0.230 
     18 |    0.207 |    0.253 
     19 |    0.219 |    0.247 
     20 |    0.221 |    0.270 

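This is a pretty typical example of the behavior of [catch] versus [info exists] as the success-failure ratio goes up. Each call makes one failing attempt (the first occurrence of apple) followed by n-1 successful ones, so the list length controls that ratio; [catch] overtakes [info exists] at a length of about 9.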

A good rule of thumb is: If you expect the variable to exist at least 90% of the time, use [catch]. If the variable will fail to exist 10% of the time or more, use [info exists].
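The 90% figure can be checked roughly against the table. The sketch below assumes a simple linear cost model (a fixed cost for the one failure plus a constant cost per success), fitted only to the rows for lengths 1 and 20. The model and the variable names are assumptions made for illustration, not part of the original measurement.

```tcl
# Timings in ms per call, copied from the table for list lengths 1 and 20.
set catch1 0.077
set catch20 0.221
set info1 0.045
set info20 0.270

# Assumed model: time(n) = fixed + perSuccess * (n - 1)
set catchPer [expr {($catch20 - $catch1) / 19.0}]
set infoPer  [expr {($info20 - $info1) / 19.0}]

# Number of successes per failure at which both methods cost the same.
set k [expr {($catch1 - $info1) / ($infoPer - $catchPer)}]
set successRate [expr {$k / ($k + 1.0)}]

puts [format "break-even at %.1f successes per failure (%.0f%% success rate)" \
    $k [expr {100.0 * $successRate}]]
```

With these inputs the break-even lands a little under 90% successes, which agrees with the rule of thumb. Other machines and Tcl versions will shift the exact point.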