[Richard Suchenwirth] 2008-07-16 - The following little script demonstrates how performance data for a program (number how often a command is called, average time) can be collected via [trace execution]. You can specify a number of commands (procs or built-ins) for which at start and end, the microseconds timestamp is taken. The difference between the two is added to an array indexed by command name. I first tried just asking for profiling [[[info commands]], but that took so much time that I couldn't stand it. See discussion below. ====== #---------------------------------------- begin package profiler package require Tcl 8.5 namespace eval profiler {variable T; variable S} proc profiler::enter args { variable S ::profiler::push S [::tcl::clock::microseconds] } proc profiler::leave {str args} { variable T; variable S set t0 [::profiler::pop S] ::profiler::push T([lindex $str 0]) [expr {[::tcl::clock::microseconds]-$t0}] } proc profiler::do {what cmdlist} { set cmd "" foreach i $cmdlist { if {!([string match *profiler::* $i])} { append cmd "trace $what execution $i enter ::profiler::enter\n" append cmd "trace $what execution $i leave ::profiler::leave\n" } } eval $cmd } interp alias {} profiler::push {} lappend proc profiler::pop _stack { upvar 1 $_stack stack K [lindex $stack end] [set stack [lrange $stack 0 end-1]] } proc profiler::K {a b} {set a} proc profiler::stats list { set sum [expr double([join $list +])] format "%10.0f %6d %10.2f" $sum [llength $list] [expr {$sum/[llength $list]}] } proc profiler::report {{channel stdout}} { variable T foreach i [lsort -dic [array names T]] { puts $channel [format "%-20s %s" $i [stats $T($i)]] } } #---------------------------------------- end package profiler ====== Here we build our "test bed", a main and a very simple function, which we then instrument: ====== proc main argv { foreach i {1 2 3 4} {puts ==[f $i]} } proc f x {expr {$x*$x}} #-- We activate profiling on all procs, and a few selected Tcl commands: profiler::do add [concat [info procs] {expr lindex foreach upvar puts clock}] main $argv profiler::do remove [info commands] ;# no error if more removed than added profiler::report ====== This is seen on [stdout] of my desktop machine: ======none ==1 ==4 ==9 ==16 expr 137662 84 1638.83 f 29036 4 7259.00 foreach 14879 1 14879.00 lindex 105756 116 911.69 main 15969 1 15969.00 puts 34906 4 8726.50 upvar 134692 92 1464.04 ====== I noticed that it is dangerous to trace some essential commands. Adding [set] to the above list slowed the process down very much, and brought these ugly figures: ======none expr 51470212 654 78700.63 f 12083216 4 3020804.00 foreach 7130413 1 7130413.00 lindex 41924889 848 49439.73 main 8306913 1 8306913.00 puts 15131320 4 3782830.00 set 43121372 906 47595.33 upvar 51070069 722 70734.17 ====== Not a pretty sight. If someone can explain, I'd gladly listen :^) [Lars H]: Does it make a difference if you change the definition of [K] to ====== proc profiler::K {a b} {return $a} ====== ? Get rid of the ''t0'' variable in profiler::leave? It seems every call to this procedure does 3 [set]s, 2 [lindex]es, and only one each of all other commands, which could make a difference if the profiler spends a lot of time profiling itself. I think traces on a command are disabled inside the trace procedure for that command, but this is on a per-command basis, so traces on all traced commands in profiler::leave will fire before trace-of-trace recursion stops. It could simply be that adding [set] to the mix is what is required to make the recursive tracing overhead dominate the I/O latency in [puts]. [RS] [K] can be dropped altogether, I just used it to avoid the ''set res ...; set stack ...; return $res'' pattern. Less fancy, but probably more efficient: ====== set res [lindex $stack end] set stack [lrange $stack 0 end-1] return $res ====== But t0 is needed as intermediate variable - otherwise I get negative times... Wiki issue: the tables above were reformatted to alignment by me several times, but it looks as if browsers tabify spaces, so it came up unaligned again :( [Lars H]: It's not the browser, AFAICT ([sockspy] is your friend), but the checkout routine (that gives you a http://wiki.tcl.tk/_edit/ page) that for some bizarre reason converts all stretches of eight spaces to tabs. (I meant to file a bug on the matter several months ago, but the bug tracking system on google is so awkward, at least IMHO.) Regarding the need for t0, it sounds as though you have a race condition (or something): The item you pop off S is not always the one that was at the top of S when you called profiler::pop? If so, this error is not prevented by having the t0 variable, but it becomes less noticable; rearranging the terms as ====== proc profiler::leave {str args} { variable T; variable S ::profiler::push T([lindex $str 0]) [expr {-[::profiler::pop S] + [::tcl::clock::microseconds]}] } ====== should have the same effect (no negative timinigs, but some very small positive timings). This bug should only hit if you trace commands called from the traces, however. Maybe it is better to just lappend all data (time, enter/leave, command name) to a large list and not compute anything until you've done? This should reduce overhead, and also prevent "race condition" errors. [FF] 2008-07-17 - That's how [(another) command profiler] does. (although, pushing execution times in a stack, produces a wrong total time if commands are nested; it needs some tweaking) ---- [DKF]: Here's another way to do it: ====== package require Tcl 8.5 trace add execution source enterstep timerEnter trace add execution source leavestep timerLeave variable timerStack {} proc timerEnter {cmd op} { variable timerStack lappend timerStack [clock microseconds] } proc timerLeave {cmd code result op} { variable timerStack set now [clock microseconds] set then [lindex $timerStack end] set timerStack [lrange $timerStack 0 end-1] if {![llength $timerStack]} { puts "\t\t\t[expr {$now-$then}]: $cmd" } } source [set argv [lassign $argv argv0];set argv0] ====== Then you can use that on a whole script like this: ======none tclsh8.5 timer.tcl yourScript.tcl arguments as normal ====== <> Performance | Example