See also Tcl IO performance, where we reached the
Interim conclusion The time spent in the original benchmark seems to be decomposable in
Further experiments provide the following timings for the most-exercised opcodes (measured in CPU ticks at 1.6 GHz):
8.5a6 time: 1822167696, count: 20004698
----------------------------------------------------
 op    %T    avgT   %ops     Nops
  6  24.53  446.87   5.00  1000323  INST_INVOKE_STK1
 80  22.68  413.32   5.00  1000019  INST_LIST_INDEX
 29  10.04   91.47  10.00  2000048  INST_INCR_SCALAR1_IMM
105   9.77   44.52  20.00  4000452  INST_START_CMD
103   7.58  138.16   5.00  1000004  INST_LIST_INDEX_IMM
 10   7.20   26.22  25.00  5000750  INST_LOAD_SCALAR1
 17   5.78   52.67  10.00  2000236  INST_STORE_SCALAR1
 47   5.23   95.32   5.00  1000001  INST_LT
  1   4.19   38.11  10.00  2001039  INST_PUSH1
  3   2.95   53.83   5.00  1000302  INST_POP
Striking observations (which require an explanation)
Taking the fastest opcode as comparison basis:
1. INST_LOAD_SCALAR1 is the fastest opcode (faster than INST_PUSH1 and INST_POP!); INST_STORE_SCALAR1 is pretty fast too
2. [lindex] is amazingly slow: the non-immediate version is as slow as a command invocation
3. command invocations are expensive (note that only '''empty''' is invoked in the loop)
4. comparisons (INST_LT) and basic arithmetic (INST_INCR_SCALAR1_IMM) are amazingly slow when compared to basic variable access
5. the "pure loss" INST_START_CMD is amazingly slow
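To put numbers on "taking the fastest opcode as comparison basis", here is a minimal sketch that divides each avgT value from the table by the avgT of INST_LOAD_SCALAR1. The helper name relativeCost is made up for this illustration, and the script needs [dict], so Tcl 8.5 or later.

```tcl
# Average ticks per opcode, copied from the avgT column of the table.
set avgT {
    INST_INVOKE_STK1      446.87
    INST_LIST_INDEX       413.32
    INST_LIST_INDEX_IMM   138.16
    INST_LT                95.32
    INST_INCR_SCALAR1_IMM  91.47
    INST_POP               53.83
    INST_STORE_SCALAR1     52.67
    INST_START_CMD         44.52
    INST_PUSH1             38.11
    INST_LOAD_SCALAR1      26.22
}

# Baseline: the fastest opcode in the table.
set base [dict get $avgT INST_LOAD_SCALAR1]

# relativeCost is a hypothetical helper, not part of Tcl.
proc relativeCost {ticks base} {
    return [expr {double($ticks) / $base}]
}

# Print each opcode with its cost as a multiple of the baseline.
dict for {op ticks} $avgT {
    puts [format "%-22s %6.2fx" $op [relativeCost $ticks $base]]
}
```

On these figures a command invocation costs roughly 17 times a scalar load, the non-immediate [lindex] roughly 16 times, and INST_INCR_SCALAR1_IMM about 3.5 times.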
The script being run is
lappend auto_path /home/CVS/emptyFunc/
package require empty
exec /usr/bin/taskset -p 0x00000001 [pid]

proc main N {
    set y 0
    set a [list foo boo moo]
    for {set i 0} {$i < $N} {incr i} {
        empty 1
        incr y
        set z [lindex $a 1]
        set z 1
        lindex $a $z
    }
}

if {[llength $argv]} {
    main [lindex $argv 0]
} else {
    main 1000000
}
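To see which opcodes the loop body compiles to, one option is to dump the bytecode of main. The sketch below makes two assumptions: the '''empty''' package is not at hand, so a do-nothing proc stands in for it (the original command may well be implemented differently), and ::tcl::unsupported::disassemble is available, which holds for recent Tcl releases but is, as the name says, unsupported. The listing prints opcode names in a different spelling than the INST_* constants, and the exact output depends on the Tcl version.

```tcl
# Stand-in for the "empty" command from the original package
# (assumption: it accepts any arguments and does nothing).
proc empty args {}

# Same loop as in the benchmark script.
proc main N {
    set y 0
    set a [list foo boo moo]
    for {set i 0} {$i < $N} {incr i} {
        empty 1
        incr y
        set z [lindex $a 1]
        set z 1
        lindex $a $z
    }
}

# Unsupported, version-dependent introspection command: it compiles
# the proc body if necessary and returns a textual bytecode listing.
set listing [::tcl::unsupported::disassemble proc main]
puts $listing
```

Matching the lines of the listing against the table should show which script command each counted opcode comes from, for instance the two different [lindex] forms.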