This page is the result of a comment on the [Better Arrays for Tcl9] page -- [JCW] ---- This demo illustrates an experimental "hash" command and datatype for Tcl, which uses the usual odd-even representation to store data, plus an internal integer "map" vector to represent a hashing layer placed on top. See http://www.equi4.com/previews/hash.tar.gz for all source code. ---- This sounds from an API point of view similar to [TclX]'s keyed lists. The mapping of course improves performance. ''Yes, but TclX's keyed lists are lists of 2-item lists, while the odd-even lists below are a single list... makes quite a difference in speed and mem-use --jcw'' ---- Differences between hash objects and arrays: * hash entries are not variables, this model does not support traces * order can be preserved, though inserts/deletes will take O(N) time * memory use is lower than arrays (2 ints per item, plus free slots) Performance-wise, the output below seems to indicate the following: * access is similar (the 29 vs. 20 uSec is maybe just call overhead) * creation from an odd-even list is over 25x faster than for arrays * full traversal takes the same amount of time for both * full traversal of the odd-even list is obviously faster for hashes * set/unset performance has not been measured (not implemented in C) The current code does not support set/unset yet. Setting would either add or replace, depending on whether the key already exists. Script: ---- puts "version = [package require jolt]" if {[info commands nop] != ""} { puts "10 nops = [time {nop;nop;nop;nop;nop;nop;nop;nop;nop;nop} 10000]" } proc hash_try {} { set a {1 one 2 two 3 three} puts "data = [hash a]" puts "length = [hash a length]" puts "keys = [hash a keys]" puts "values = [hash a values]" puts "map = [hash a map]" puts "get 1 = [hash a get 1]" puts "get 2 = [hash a get 2]" puts "get 3 = [hash a get 3]" foreach x {1 10 100 1000 10000 100000} { set v {} for {set y 0} {$y < $x} {incr y} { lappend v $y [expr {$y+$y}] } set n [expr {1000000/$x}] if {$n > 100} { set n 100 } set r [lindex [time {set a $v; hash a} ${n}0] 0] set s [lindex [time {array unset b; array set b $v} $n] 0] puts [format "create %6d = %5d hash, vs. %d array" $x $r $s] } foreach x {0 1000 abc} { catch {hash a get $x} e1 catch {set b($x)} e2 puts "get $x: $e1 <-> $e2" } set x 1 puts "10x h = [time { hash a get $x; hash a get $x; hash a get $x; hash a get $x; hash a get $x; hash a get $x; hash a get $x; hash a get $x; hash a get $x; hash a get $x} 10000]" puts "10x a = [time { set b($x); set b($x); set b($x); set b($x); set b($x); set b($x); set b($x); set b($x); set b($x); set b($x)} 10000]" puts "h all = [time {foreach x [hash a keys] { hash a get $x }}]" puts "a all = [time {foreach x [array names b] { set b($x) }}]" puts "same = [expr {[lsort [hash a]] == [lsort [array get b]]}]" puts "h list = [time {foreach {x y} [hash a] { }}]" puts "a list = [time {foreach {x y} [array get b] { }}]" } ---- Output (SuSE 7.1 Linux, PIII/650): ---- version = 2001.11.06.170201 10 nops = 7 microseconds per iteration HASH ==== data = 1 one 2 two 3 three length = 3 keys = 1 2 3 values = one two three map = 0 -1 0 -1 0 -1 0 -1 2105051955 2 -2061914958 4 0 -1 1977051568 0 get 1 = one get 2 = two get 3 = three create 1 = 5 hash, vs. 12 array create 10 = 5 hash, vs. 31 array create 100 = 6 hash, vs. 323 array create 1000 = 30 hash, vs. 2310 array create 10000 = 1370 hash, vs. 37315 array create 100000 = 15341 hash, vs. 415132 array get 0: 0 <-> 0 get 1000: 2000 <-> 2000 get abc: no such key <-> can't read "b(abc)": no such element in array 10x h = 29 microseconds per iteration 10x a = 20 microseconds per iteration h all = 557448 microseconds per iteration a all = 556954 microseconds per iteration same = 1 h list = 216386 microseconds per iteration a list = 542398 microseconds per iteration ----