This page is the result of a comment on the Better Arrays for Tcl9 page -- JCW
This demo illustrates an experimental "hash" command and datatype for Tcl, which uses the usual odd-even representation to store data, plus an internal integer "map" vector to represent a hashing layer placed on top. See http://www.equi4.com/previews/hash.tar.gz for all source code.
This sounds from an API point of view similar to TclX's keyed lists. The mapping of course improves performance.
Yes, but TclX's keyed lists are lists of 2-item lists, while the odd-even lists below are a single list... makes quite a difference in speed and mem-use --jcw
Differences between hash objects and arrays:
Performance-wise, the output below seems to indicate the following:
The current code does not support set/unset yet. Setting would either add or replace, depending on whether the key already exists.
Script:
puts "version = [package require jolt]" if {[info commands nop] != ""} { puts "10 nops = [time {nop;nop;nop;nop;nop;nop;nop;nop;nop;nop} 10000]" } proc hash_try {} { set a {1 one 2 two 3 three} puts "data = [hash a]" puts "length = [hash a length]" puts "keys = [hash a keys]" puts "values = [hash a values]" puts "map = [hash a map]" puts "get 1 = [hash a get 1]" puts "get 2 = [hash a get 2]" puts "get 3 = [hash a get 3]" foreach x {1 10 100 1000 10000 100000} { set v {} for {set y 0} {$y < $x} {incr y} { lappend v $y [expr {$y+$y}] } set n [expr {1000000/$x}] if {$n > 100} { set n 100 } set r [lindex [time {set a $v; hash a} ${n}0] 0] set s [lindex [time {array unset b; array set b $v} $n] 0] puts [format "create %6d = %5d hash, vs. %d array" $x $r $s] } foreach x {0 1000 abc} { catch {hash a get $x} e1 catch {set b($x)} e2 puts "get $x: $e1 <-> $e2" } set x 1 puts "10x h = [time { hash a get $x; hash a get $x; hash a get $x; hash a get $x; hash a get $x; hash a get $x; hash a get $x; hash a get $x; hash a get $x; hash a get $x} 10000]" puts "10x a = [time { set b($x); set b($x); set b($x); set b($x); set b($x); set b($x); set b($x); set b($x); set b($x); set b($x)} 10000]" puts "h all = [time {foreach x [hash a keys] { hash a get $x }}]" puts "a all = [time {foreach x [array names b] { set b($x) }}]" puts "same = [expr {[lsort [hash a]] == [lsort [array get b]]}]" puts "h list = [time {foreach {x y} [hash a] { }}]" puts "a list = [time {foreach {x y} [array get b] { }}]" }
Output (SuSE 7.1 Linux, PIII/650):
version = 2001.11.06.170201 10 nops = 7 microseconds per iteration HASH ==== data = 1 one 2 two 3 three length = 3 keys = 1 2 3 values = one two three map = 0 -1 0 -1 0 -1 0 -1 2105051955 2 -2061914958 4 0 -1 1977051568 0 get 1 = one get 2 = two get 3 = three create 1 = 5 hash, vs. 12 array create 10 = 5 hash, vs. 31 array create 100 = 6 hash, vs. 323 array create 1000 = 30 hash, vs. 2310 array create 10000 = 1370 hash, vs. 37315 array create 100000 = 15341 hash, vs. 415132 array get 0: 0 <-> 0 get 1000: 2000 <-> 2000 get abc: no such key <-> can't read "b(abc)": no such element in array 10x h = 29 microseconds per iteration 10x a = 20 microseconds per iteration h all = 557448 microseconds per iteration a all = 556954 microseconds per iteration same = 1 h list = 216386 microseconds per iteration a list = 542398 microseconds per iteration