See also Tcl IO performance, where we reached the
Interim conclusion The time spent in the original benchmark seems to be decomposable in
Further experiments
provide the following timings for the most-exercised opcodes (measured in cpu ticks at 1.6MHz):
8.5a6 time: 1822167696, count: 20004698 ---------------------------------------------------- op %T avgT %ops Nops 6 24.53 446.87 5.00 1000323 INST_INVOKE_STK1 80 22.68 413.32 5.00 1000019 INST_LIST_INDEX 29 10.04 91.47 10.00 2000048 INST_INCR_SCALAR1_IMM 105 9.77 44.52 20.00 4000452 INST_START_CMD 103 7.58 138.16 5.00 1000004 INST_LIST_INDEX_IMM 10 7.20 26.22 25.00 5000750 INST_LOAD_SCALAR1 17 5.78 52.67 10.00 2000236 INST_STORE_SCALAR1 47 5.23 95.32 5.00 1000001 INST_LT 1 4.19 38.11 10.00 2001039 INST_PUSH1 3 2.95 53.83 5.00 1000302 INST_POP
Striking observations (which require an explanation)
Taking the fastest opcode as comparison basis:
(do note that INST_LOAD_SCALAR1 provides a generous upper bound on the cost of opcode dispatch - and the possible savings when improving that part only)
The script being run is
lappend auto_path /home/CVS/emptyFunc/ package require empty exec /usr/bin/taskset -p 0x00000001 [pid] proc main N { set y 0 set a [list foo boo moo] for {set i 0} {$i < $N} {incr i} { empty 1 incr y set z [lindex $a 1] set z 1 lindex $a $z } } if {[llength $argv]} { main [lindex $argv 0] } else { main 1000000 }