Profiling Tcl by Overloading Proc

George Peter Staplin 2005-12-28: I wanted the ability to profile my whole program without wrapping many commands with time, so I created a version of proc that logs the time. The cost of the logging and tracing increases the time, but it seems to be a good way to find hot spots in a program.


# By George Peter Staplin
# Dec 2005 
array set ::debug_state {}

proc debug_callback {args} {
    global debug_state
   
    set proc [lindex [lindex $args 0] 0]
    set type [lindex $args end]
   
    if {"enter" eq $type} {
        set debug_state($proc) [clock clicks]
    } else {
        set end [clock clicks]
        set fd [open $::debug_log_file a]
        #catches to stop failure during recursive calls  - !todo: proper handling of recursion
        catch {puts $fd "$proc took: [expr {$end - $::debug_state($proc)}]"}
        close $fd
        catch {unset debug_state($proc)}
    }
}

proc debug_trace cmd {
    trace add execution $cmd enter debug_callback
    trace add execution $cmd leave debug_callback
}

set ::debug_log_file debug.log
rename proc _proc


_proc proc {name arglist body} {
    if {![string match ::* $name]} {
        #not already an 'absolute' namespace path
        #qualify it so that traces can find it
        set name [uplevel 1 namespace current]::[set name]
    }
    _proc $name $arglist $body
   
    debug_trace $name
}

Possible future improvements might include keeping a log of the entire runtime, and then averaging the runtime by the number of calls. I however didn't need that for my usage, because I was killing the program with ^C. :)


JMN 2005-12-30: Added kludgy 'catch' statements around ::debug_state array access so that recursive calls to a proc don't stop the program. Also replaced the _proc procedure with a version that works for creating procedures in 'other' namespaces. e.g The following script would fail before:

namespace eval ::a {}
namespace eval ::b {proc ::a::myproc {} {return a-myproc}} 

OLD VERSION:

_proc proc {name arglist body} {
    set ns [uplevel 1 namespace current]
    set p [set ns]::[set name]
    _proc $p $arglist $body
   
    debug_trace $p
}

George Peter Staplin 2005-12-30: Thank you JMN. You have given me some ideas for improvements that would handle recursion in a better manner. I will most likely post a followup to this code based on your improvements later.


Barney Blankenship 2006-06-15: Thanks George, and JMN, here is my contribution:

#=================================================================
# TIME PROFILER
# by [Barney Blankenship] (based on work by [George Peter Staplin])
# 
# Insert this snippet above the function definitions you want
# to have profiled.
#
# TO INITIALIZE OR CLEAR/RESET THE PROFILER...
# global TimeProfilerMode
# if { [info exists TimeProfilerMode] } {
#      global ProfilerArray
#      array unset ProfilerArray
# }
#
# TO PRODUCE THE OUTPUT (currently hard-coded to "TimingDump.txt"
# file output "append" in the current working directory)...
# global TimeProfilerMode
# if { [info exists TimeProfilerMode] } {
#      TimeProfilerDump description
# }
# (description: text string shown at the top of the output)
#
# PROFILING DATA COLLECTION
# (This describes what is included in the output)
# Provides total elapsed time in milliseconds between reset and dump.
# Provides function call statistics...
# for each function defined after this snippet, provide...
#   Number of times called
#   Average milliseconds per call
#   Maximum milliseconds call time
#   Minimum milliseconds call time
#   Total milliseconds used
#   Ratio of above to total elapsed time (XX.XXX percent)
# In addition, the function call statistics are sorted
# in descending values of Ratio (above).
#
# Note that nested functions and functions that use
# recursion are provided for and timed properly.
#
# TO DISABLE PROFILING WITHOUT REMOVING THE PROFILER
# Comment out the "set TimeProfilerMode 0" below...
#=================================================================
global TimeProfilerMode
set TimeProfilerMode 0

if { [info exists TimeProfilerMode] } {
    proc TimeProfiler {args} {
        global ProfilerArray
        
        # Intialize the elapsed time counters if needed...
        if { ![info exists ProfilerArray(ElapsedClicks)] } {
            set ProfilerArray(ElapsedClicks) [expr double([clock clicks])]
            set ProfilerArray(Elapsedms) [expr double([clock clicks -milliseconds])]
        }
        
        set fun [lindex [lindex $args 0] 0]
        
        if { [lindex $args end] == "enter" } {
            # Initalize the count of functions if needed...
            if { ![info exists ProfilerArray(funcount)] } {
                set ProfilerArray(funcount) 0
            }
            
            # See if this function is here for the first time...
            for { set fi 0 } { $fi < $ProfilerArray(funcount) } { incr fi } {
                if { [string equal $ProfilerArray($fi) $fun] } {
                    break
                }
            }
            if { $fi == $ProfilerArray(funcount) } {
                # Yes, function first time visit, add...
                set ProfilerArray($fi) $fun
                set ProfilerArray(funcount) [expr $fi + 1]
            }
            
            # Intialize the "EnterStack" if needed...
            if { ![info exists ProfilerArray(ES0)] } {
                set esi 1
            } else {
                set esi [expr $ProfilerArray(ES0) + 1]
            }
            # Append a "enter clicks" and "enter function name index" to the EnterStack...
            set ProfilerArray(ES0) $esi
            set ProfilerArray(ES$esi) [clock clicks]
            # Note: the above is last thing done so timing start is closest to
            # function operation start as possible.
        } else {
            # Right away stop timing...
            set deltaclicks [clock clicks]
            
            # Do not bother if TimeProfilerDump wiped the ProfilerArray
            # just prior to this "leave"...
            if { [info exists ProfilerArray(ES0)] } {
                # Pull an "enter clicks" off the EnterStack...
                set esi $ProfilerArray(ES0)
                set deltaclicks [expr $deltaclicks - $ProfilerArray(ES$esi)]
                incr esi -1
                set ProfilerArray(ES0) $esi
                
                # Correct for recursion and nesting...
                if { $esi } {
                    # Add our elapsed clicks to the previous stacked values to compensate...
                    for { set fix $esi } { $fix > 0 } { incr fix -1 } {
                        set ProfilerArray(ES$fix) [expr $ProfilerArray(ES$fix) + $deltaclicks]
                    }
                }
                
                # Intialize the delta clicks array if needed...
                if { ![info exists ProfilerArray($fun,0)] } {
                    set cai 1
                } else {
                    set cai [expr $ProfilerArray($fun,0) + 1]
                }
                
                # Add another "delta clicks" reading...
                set ProfilerArray($fun,0) $cai
                set ProfilerArray($fun,$cai) $deltaclicks
            }
        }
    }
    
    proc TimeProfilerDump {description} {
        global ProfilerArray
        
        # Stop timing elapsed time and calculate conversion factor for clicks to ms...
        set EndClicks [expr {double([clock clicks]) - $ProfilerArray(ElapsedClicks)}]
        set Endms [expr {double([clock clicks -milliseconds]) - $ProfilerArray(Elapsedms)}]
        set msPerClick [expr $Endms / $EndClicks]
        
        # Visit each function and generate the statistics for it...
        for { set fi 0 ; set PerfList "" } { $fi < $ProfilerArray(funcount) } { incr fi } {
            set fun $ProfilerArray($fi)
            if { ![info exists ProfilerArray($fun,0)] } {
                continue
            }
            for { set max -1.0 ; set min -1.0 ; set ctotal 0.0 ; set cai 1 } { $cai <= $ProfilerArray($fun,0) } { incr cai } {
                set clicks $ProfilerArray($fun,$cai)
                set ctotal [expr {$ctotal + double($clicks)}]
                if { $max < 0 || $max < $clicks } {
                    set max $clicks
                }
                if { $min < 0 || $clicks < $min } {
                    set min $clicks
                }
            }
            set cavg [expr {$ctotal / double($ProfilerArray($fun,0))}]
            set ProfilerArray($fun,avgms) [expr $cavg * $msPerClick]
            set ProfilerArray($fun,totalms) [expr $ctotal * $msPerClick]
            set ProfilerArray($fun,ratio) [expr {double($ctotal / $EndClicks) * 100.0}]
            set ProfilerArray($fun,max) [expr $max * $msPerClick]
            set ProfilerArray($fun,min) [expr $min * $msPerClick]

            # Append to the sorting list the pairs of ratio values and function indexes...
            lappend PerfList [list $ProfilerArray($fun,ratio) $fi]
        }
        # Sort the profile data by Ratio...
        set PerfList [lsort -real -decreasing -index 0 $PerfList]
        
        # Finally, generate the results...
        set fd [open "TimingDump.txt" a]
        puts $fd "\n===================================================================="
        puts $fd [format "     T I M I N G  D U M P  <%s>" $description]
        puts $fd [format "\n      Elapsed time: %.0f ms" $Endms]
        puts $fd [format "\n      %s" [clock format [clock seconds]]]
        puts $fd "===================================================================="
        for { set li 0 } { $li < [llength $PerfList] } { incr li } {
            set fun $ProfilerArray([lindex [lindex $PerfList $li] 1])
            puts $fd [format ">>>>> FUNCTION: %s" $fun]
            puts $fd [format "       CALLS: %d" $ProfilerArray($fun,0)]
            puts $fd [format "    AVG TIME: %.3f ms" $ProfilerArray($fun,avgms)]
            puts $fd [format "    MAX TIME: %.3f ms" $ProfilerArray($fun,max)]
            puts $fd [format "    MIN TIME: %.3f ms" $ProfilerArray($fun,min)]
            puts $fd [format "  TOTAL TIME: %.3f ms" $ProfilerArray($fun,totalms)]
            puts $fd [format "       RATIO: %.3f%c\n" $ProfilerArray($fun,ratio) 37]
        }
        close $fd
        
        # Reset the world...
        array unset ProfilerArray
    }
    
    #=================================================================
    # Overload "proc" so that functions defined after
    # this point have added trace handlers for entry and exit.
    # [George Peter Staplin]
    #=================================================================
    rename proc _proc
    
    _proc proc {name arglist body} {
                                    #===================================        
                                    # Allow multiple namespace use [JMN]
                                    if { ![string match ::* $name] } {
                                        # Not already an 'absolute' namespace path,
                                        # qualify it so that traces can find it...
                                        set name [uplevel 1 namespace current]::[set name]
                                    }
                                    #===================================
                                    
                                    _proc $name $arglist $body
                                    trace add execution $name enter TimeProfiler
                                    trace add execution $name leave TimeProfiler
                                }
}

Here is the time profiler output on the Piechart Disk program scan of G: drive on my PC...

====================================================================
    T I M I N G  D U M P  <Piecart Disk: G:/>

     Elapsed time: 33062 ms

     Fri Jun 16 11:38:28 PM Hawaiian Standard Time 2006
====================================================================
>>>>> FUNCTION: ReadDirectory
      CALLS: 2281
   AVG TIME: 10.929 ms
   MAX TIME: 3527.845 ms
   MIN TIME: 0.144 ms
 TOTAL TIME: 24929.068 ms
      RATIO: 75.401%

>>>>> FUNCTION: PackAndSort
      CALLS: 2270
   AVG TIME: 1.372 ms
   MAX TIME: 501.295 ms
   MIN TIME: 0.166 ms
 TOTAL TIME: 3114.665 ms
      RATIO: 9.421%

>>>>> FUNCTION: GetGlob
      CALLS: 2281
   AVG TIME: 1.175 ms
   MAX TIME: 84.923 ms
   MIN TIME: 0.325 ms
 TOTAL TIME: 2679.374 ms
      RATIO: 8.104%

>>>>> FUNCTION: Dolsort
      CALLS: 2270
   AVG TIME: 0.572 ms
   MAX TIME: 205.545 ms
   MIN TIME: 0.024 ms
 TOTAL TIME: 1297.803 ms
      RATIO: 3.925%

>>>>> FUNCTION: DirDataMagic
      CALLS: 1
   AVG TIME: 494.874 ms
   MAX TIME: 494.874 ms
   MIN TIME: 494.874 ms
 TOTAL TIME: 494.874 ms
      RATIO: 1.497%

>>>>> FUNCTION: PlotPiechart
      CALLS: 1
   AVG TIME: 181.087 ms
   MAX TIME: 181.087 ms
   MIN TIME: 181.087 ms
 TOTAL TIME: 181.087 ms
      RATIO: 0.548%

>>>>> FUNCTION: ScanProgressTask
      CALLS: 31
   AVG TIME: 5.067 ms
   MAX TIME: 17.719 ms
   MIN TIME: 4.329 ms
 TOTAL TIME: 157.068 ms
      RATIO: 0.475%

>>>>> FUNCTION: OneSecondProgress
      CALLS: 4553
   AVG TIME: 0.025 ms
   MAX TIME: 1.853 ms
   MIN TIME: 0.019 ms
 TOTAL TIME: 113.268 ms
      RATIO: 0.343%

>>>>> FUNCTION: ListDirectory
      CALLS: 1
   AVG TIME: 89.746 ms
   MAX TIME: 89.746 ms
   MIN TIME: 89.746 ms
 TOTAL TIME: 89.746 ms
      RATIO: 0.271%

>>>>> FUNCTION: FormatBytes
      CALLS: 40
   AVG TIME: 0.068 ms
   MAX TIME: 0.122 ms
   MIN TIME: 0.023 ms
 TOTAL TIME: 2.729 ms
      RATIO: 0.008%

>>>>> FUNCTION: FormatCommas
      CALLS: 31
   AVG TIME: 0.046 ms
   MAX TIME: 0.086 ms
   MIN TIME: 0.039 ms
 TOTAL TIME: 1.438 ms
      RATIO: 0.004%

>>>>> FUNCTION: GetColor
      CALLS: 6
   AVG TIME: 0.061 ms
   MAX TIME: 0.168 ms
   MIN TIME: 0.028 ms
 TOTAL TIME: 0.366 ms
      RATIO: 0.001%

Barney Blankenship 2006-06-17: Added MAX and MIN function call time measurements, updated the Time Profiler snippet and example output here.


Barney Blankenship 2006-06-18: Oh My God! The new beta at ActiveState causes piechart.tcl to run 116% faster. I must somehow find a way to wrap with it!