'''How to Measure Performance'''

** See Also **

   [Profiling Tcl]:
   [time]:
   [timers].tcl:
   [PSE]: Allow me to shamelessly plug my package

** Description **

Before anything else, '''`[time]`''' it!

For newer revisions of Tcl 8.6, the http://www.tcl-lang.org/man/tcl8.6/TclCmd/timerate.htm%|%timerate%|% command is preferable. See https://core.tcl-lang.org/tips/doc/trunk/tip/527.md%|%TIP 527%|%.

The wonderful `[time]` command can help you answer questions like "Is this code faster than that code?" Check out the man page, or see the example at [Counting Elements in a List], which includes a basic timing framework.

If you need to record time stamps (e.g., in event-driven Tk applications), take a look at `[clock clicks]`, which gives you a time stamp of the finest granularity available on the [platform].

Sometimes you may want to '''profile''' an application to find out where all the time is spent. There are several solutions; see [Profiling Tcl] for some suggestions. One idea is to build timing routines into an extension or application. For instance, `md5pure::time` lets anyone time the [md5pure] extension and compare the results to those of others (see the chart of timings on the [md5pure] page).

** Tools **

   [PDQ with Tcl%|%PDQ]: an analytical solver in which resources are represented by queues

** Timing Example **

Use `[time]` to compare two different coding techniques. For this simple example, we will compare list creation using the `lappend` list command and the `append` string command, but the technique applies to many different situations. Put your test code in a proc so that it gets byte-compiled. Run enough iterations of the test code to make a meaningful timed test. (Or maybe just use a slower computer.)
======
proc test1 {} {
    for {set i 0} {$i < 1000} {incr i} {
        lappend mylist $i
    }
}

proc test2 {} {
    for {set i 0} {$i < 1000} {incr i} {
        append mylist "$i "
    }
}
======

======none
% time test1 100
349 microseconds per iteration
% time test2 100
1018 microseconds per iteration
======

----

[DKF]: '''You should be aware that [UNIX] has a much finer time granularity than Windows''' (certainly [Windows 95%|%95]/[Windows 98%|%98], and quite probably NT too).

''[KBK] 2000-12-24: The problem with Windows clock resolution is fixed in the latest alpha of 8.4.''

The best way to benchmark a tiny snippet of Tcl (especially where it is platform-independent code you're testing) is to use a slow [UNIX] box. The slowness of its operation allows for sensitivity, and the fine granularity of its timer lets you pick up what differences there are. Then you have to pray that the relative speeds scale as you move to different platforms/architectures and/or faster machines, which is usually a good assumption if you're not [shimmering] between numbers and strings (where performance seems to vary a lot according to the vendor's implementation of the [C] library).

[Unix] has millisecond-sized granularity (i.e., around 1000 Hz or 1 kHz). Windows has, for reasons relating to the hardware used in the original IBM PC, 42.7 Hz (or thereabouts) granularity. (''KBK 2000-12-24: The latest 8.4 alpha uses the 1.193 MHz 8254 timer in place of the Windows 1- or 10-ms time quantum.'') Obviously, this makes it harder to perform finer measurements. Good thing most operations in Tcl are actually platform-independent, so the relative differences in execution speed tend to carry across.

[KBK]: The patch for using the 8254 clock in earlier Tcl releases is at [http://www.deja.com/getdoc.xp?AN=666545441&fmt=text] (''[AMG]: 404''). Note that multi-CPU Windows systems are still going to be limited to the 1-ms resolution, because of a bug in the Hardware Abstraction Layer that Microsoft ships.
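In Tcl 8.7 (and builds incorporating TIP 527), the `timerate` command mentioned in the description above sidesteps much of this hand-tuning: instead of a fixed iteration count, it repeats the script for a maximum amount of time, and it can calibrate and subtract its own per-iteration overhead. A minimal sketch, assuming a Tcl with `timerate` available (reported figures will vary by machine):

======
# Define the two variants from the timing example above.
proc test1 {} { for {set i 0} {$i < 1000} {incr i} { lappend mylist $i } }
proc test2 {} { for {set i 0} {$i < 1000} {incr i} { append mylist "$i " } }

# Calibrate the overhead of the measurement loop itself;
# subsequent timerate calls subtract it automatically.
timerate -calibrate {}

# Repeat each script for up to 500 ms and report microseconds per
# iteration, iteration count, and iterations per second.
puts [timerate { test1 } 500]
puts [timerate { test2 } 500]
======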
----

''Remember'', the actual numerical performance of computers varies widely with operating system and processor speed, and comparing actual figures is very tricky. If you're going to put up benchmarking info, at least do everyone the favour of [Finding Out Your Processor and Operating System Configuration]...

----

[MAK]: While working on optimizations for [TkPNG], alternating between [time] and gprof for various measurements, I sometimes saw my numbers change in odd ways, with some going up and some going down for a change that should affect all images in one direction or the other. I was interested in seeing error margins to prove to myself I wasn't insane, so I whipped up this "errortime" proc. It works similarly to [time], but it runs the script through [time] with a given group size, does so the specified number of times, then calculates the average and error margin.

======
proc errortime {script groupsize {iters 2}} {
    set times {}
    if {$iters < 2} {set iters 2}
    for {set i 0} {$i < $iters} {incr i} {
        set result [uplevel [list time $script $groupsize]]
        lappend times [lindex $result 0]
    }
    set average 0.0
    set s2 0.0
    foreach time $times {
        set average [expr {$average + double($time)/$iters}]
    }
    foreach time $times {
        set s2 [expr {$s2 + (($time-$average)*($time-$average) / ($iters-1))}]
    }
    set sigma [expr {int(sqrt($s2))}]
    set average [expr {int($average)}]
    return "$average +/- $sigma microseconds per iteration"
}
======

Example:

======
proc test1 {} {
    for {set i 0} {$i < 1000} {incr i} {
        lappend mylist $i
    }
}

proc test2 {} {
    for {set i 0} {$i < 1000} {incr i} {
        append mylist "$i "
    }
}
======

======none
% time test1 10000
855 microseconds per iteration
% time test2 10000
2376 microseconds per iteration
% errortime test1 100 100
843 +/- 42 microseconds per iteration
% errortime test2 100 100
2316 +/- 29 microseconds per iteration
======

----

'''[nektomk] - 2023-06-20 02:13:54'''

Are `time` results unstable and dependent on position in the script?
======
proc mimo1 { a b } { tcl::mathfunc::min $a $b }
proc mimo2 { a b } { expr { min($a,$b) } }

mimo1 200 20
mimo2 200 20
time {}

# slowest results
set c 0
puts [ time { incr c [ mimo2 100 10 ] } 50000 ]
puts "----"
# faster results
set c 0
puts [ time { incr c [ mimo2 100 10 ] } 50000 ]
======

Usually the second result is faster, regardless of which function (mimo1 or mimo2) is measured.

'''[HE] - 2023-06-21''': That is not something I can confirm. Without looking at the source code, my expectation is that `time` takes a timestamp in microseconds at the start, executes the code the given number of times, takes another timestamp afterwards, and calculates the time per iteration. Consequently, the measurement catches everything that happens on your computer during the run: additional load influencing the execution time of your script (such as regular background tasks), or simply a change of processor speed during execution.

You ([nektomk]) did not write how long your tests take on average. On my computer it is around 0.9 microseconds. The differences between measurements with 50000 repetitions are around 0.3 microseconds; with 500000 repetitions they are only about 0.02 microseconds, and the average was faster than with the smaller number of repetitions.

Why does `time` need repetitions? A single run of the code executes far too fast to be measured: the finest reliable time unit available in Tcl is the microsecond, and 0.9 is below what can be measured directly. So Tcl repeats the script x times and divides the result by the number of repetitions. We also have nearly the same overhead for every measurement, which means it influences the result less with higher repetition counts. On my computer this overhead is 0.05 microseconds per test (calling `time` with an empty body), or 0.2 microseconds when also including the call of an empty procedure.

Such short time measurements are more accurate when the total measurement time is around one or two seconds. If it is too short, measurement errors increase.
If it takes too long, the risk is higher that the processor steps up its clock to optimize execution or, in today's summer weather, steps down because of rising processor temperature. I also often execute tests a couple of times in a changing sequence, like test1, test2, test3, test3, test2, test1. That averages out outside influences like changed processor speed.

As always, it is important to check what we want to measure and how we build up the measurement setup.

======
% proc mimo0 { a b } { return }
% proc mimo1 { a b } { tcl::mathfunc::min $a $b }
% proc mimo2 { a b } { expr { min($a,$b) } }
%
% time {mimo1 100 10 } 50000
0.94096 microseconds per iteration
% time {mimo1 100 10 } 50000
0.9596 microseconds per iteration
% time {mimo1 100 10 } 50000
0.93186 microseconds per iteration
% time {mimo1 100 10 } 50000
0.96122 microseconds per iteration
% time {mimo2 100 10 } 50000
0.90258 microseconds per iteration
% time {mimo2 100 10 } 50000
0.91626 microseconds per iteration
% time {mimo2 100 10 } 50000
0.92198 microseconds per iteration
% time {mimo2 100 10 } 50000
0.9034 microseconds per iteration
% time {mimo0 100 10 } 5000000
0.2092264 microseconds per iteration
% time {mimo0 100 10 } 5000000
0.2164206 microseconds per iteration
% time {mimo0 100 10 } 5000000
0.2109234 microseconds per iteration
% time {mimo0 100 10 } 5000000
0.2165014 microseconds per iteration
% time {mimo1 100 10 } 5000000
0.9046864 microseconds per iteration
% time {mimo1 100 10 } 5000000
0.9301398 microseconds per iteration
% time {mimo1 100 10 } 5000000
0.924608 microseconds per iteration
% time {mimo1 100 10 } 5000000
0.9071432 microseconds per iteration
% time {mimo2 100 10 } 5000000
0.8818558 microseconds per iteration
% time {mimo2 100 10 } 5000000
0.8812846 microseconds per iteration
% time {mimo2 100 10 } 5000000
0.8886016 microseconds per iteration
% time {mimo2 100 10 } 5000000
0.8705476 microseconds per iteration
% time {} 5000000
0.0523706 microseconds per iteration
% time {} 5000000
0.0517532 microseconds per iteration
% time {} 5000000
0.0524266 microseconds per iteration
% time {} 5000000
0.0512242 microseconds per iteration
======

<<categories>> Performance
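The overhead [HE] describes above can also be factored out explicitly: time an empty script with the same repetition count and subtract it from the gross figure. A small sketch (the repetition count is arbitrary; real values depend on the machine):

======
# Measure mimo1, then subtract the per-iteration cost of the empty
# timing loop to approximate the net cost of the call itself.
proc mimo1 { a b } { tcl::mathfunc::min $a $b }

set reps 500000
set gross    [lindex [time { mimo1 100 10 } $reps] 0]
set overhead [lindex [time {} $reps] 0]
puts "gross: $gross microseconds per iteration"
puts "net:   [expr {$gross - $overhead}] microseconds per iteration"
======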