Vincent Wartelle: my provisional conclusions and where they come from.
A. OBJECTIVE CONCLUSIONS
These figures apply to Tcl 8.3 on 32-bit machines only.

each Tcl object:
    any new string, number, date, value --> 24 bytes + content size
    any pre-existing string, data, value --> 4 bytes (pointer only)

content size depends on encoding and data type:
    one or two bytes per char for string values
    may be 0 for a number (integer/double) if it is never used as a string (the value is then held in the core Tcl object itself)
Jeffrey Hobbs comments: UTF-8 can go up to 3 bytes per char for the 2-byte unicode that Tcl uses internally. Also, content size can be greater for UnicodeString objects, List objects, ... that all malloc some extra space for their internal reps.
each variable = 48 bytes + "content size" of the name + "tcl object size" of the content
each hash key entry = 48 bytes + "content size" of the key + "tcl object size" of the value
each list = 32 bytes + size of each list entry
each list entry = 4 bytes + "tcl object size" of the content
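As a rough illustration, the rules above can be turned into a few estimator procs. This is only a sketch: the proc names are hypothetical helpers (not part of Tcl), and the constants are the Tcl 8.3, 32-bit figures quoted above.

```tcl
# Hypothetical helpers: estimates only, valid for Tcl 8.3 on 32-bit machines.

# A newly created Tcl object: 24 bytes plus its content size.
proc newObjBytes {contentBytes} {
    return [expr {24 + $contentBytes}]
}

# A variable or a hash key entry: 48 bytes, plus the content size of
# the name (or key), plus the Tcl object size of the value.
proc varBytes {nameBytes valueObjBytes} {
    return [expr {48 + $nameBytes + $valueObjBytes}]
}

# A list: 32 bytes, plus 4 bytes and the object size for each entry.
# $entryObjBytes is a list holding the object size of every entry.
proc listBytes {entryObjBytes} {
    set total 32
    foreach b $entryObjBytes {
        incr total [expr {4 + $b}]
    }
    return $total
}

# Example: a variable with a 4-byte name holding a new 7-byte string.
puts [varBytes 4 [newObjBytes 7]]

# Example: a list of three new 7-byte strings.
set obj [newObjBytes 7]
puts [listBytes [list $obj $obj $obj]]
```

With these figures the variable example comes to 48 + 4 + 31 = 83 bytes, and the list example to 32 + 3 * (4 + 31) = 137 bytes.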
B. SUBJECTIVE CONCLUSIONS
52 bytes overhead for each variable, 52 bytes overhead for each hash table key
C. INFORMATION FROM NEWSGROUPS
1. Excerpt from [L1 ]:
> On a 32 bit machine where alignment is 4 byte boundary
> and the types have the following sizes,
>     long      4 bytes
>     int       4 bytes
>     char *    4 bytes
>     double    8 bytes
>     void *    4 bytes
> sizeof (Tcl_Obj) = 4 + 4 + 4 + 4 + MAX (4, 8, 4, 4 + 4)
>                  = 24 bytes
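The arithmetic in this excerpt can be replayed in a few lines of Tcl. This is a sketch under the excerpt's own assumptions (4-byte int, long and pointers, 8-byte double). The variable names are mine, and the mapping of the four leading 4-byte terms to the refCount, bytes, length and typePtr fields follows the Tcl_Obj declaration in tcl.h.

```tcl
# Type sizes in bytes, as assumed in the excerpt (32-bit machine).
set intBytes    4
set longBytes   4
set ptrBytes    4
set doubleBytes 8

# The internal representation is a C union, so it is as large as its
# largest member: long, double, one pointer, or a pair of pointers.
set unionBytes 0
foreach member [list $longBytes $doubleBytes $ptrBytes [expr {2 * $ptrBytes}]] {
    if {$member > $unionBytes} {
        set unionBytes $member
    }
}

# refCount (int) + bytes (char *) + length (int) + typePtr (pointer) + union
set tclObjBytes [expr {$intBytes + $ptrBytes + $intBytes + $ptrBytes + $unionBytes}]
puts "sizeof(Tcl_Obj) = $tclObjBytes bytes"
```

The loop stands in for MAX, since the expr of Tcl 8.3 has no max() function.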
2. Excerpt from [L2 ]:
>> [experiment shows that...] approximately 54 bytes for each key. [...]
Well, it takes a certain amount of space to store the hash entry (four words plus the size of the key; median about 20 bytes in your case on a 32-bit machine) and more to store the variable (each entry in an array is an independent variable that can support its own traces, etc.) which adds another 8 words or 32 bytes. This gives about 52 bytes per array member; pretty close to what you report...
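The 52-byte figure in this reply can be checked the same way. In this sketch the 4-byte key size is my assumption, chosen so that the hash entry matches the "about 20 bytes" quoted above; the variable names are hypothetical.

```tcl
set wordBytes 4    ;# one machine word on a 32-bit machine
set keyBytes  4    ;# assumed size of a short key such as "1234"

# Hash entry: four words plus the key.
set hashEntryBytes [expr {4 * $wordBytes + $keyBytes}]

# Each array element is an independent variable: 8 words.
set variableBytes [expr {8 * $wordBytes}]

set perMemberBytes [expr {$hashEntryBytes + $variableBytes}]
puts "hash entry $hashEntryBytes + variable $variableBytes = $perMemberBytes bytes per array member"
```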
D. MY EXPERIMENTS
tclsh 8.3.2 built with TCL_MEM_DEBUG, on Windows Millennium Edition, 32-bit machine
1. hashtable with empty values

    % memory info
    current bytes allocated 152681
    ...
    % for {set i 0} {$i < 10000 } { incr i } { set t($i) "" }
    % memory info
    current bytes allocated 698453
    ...

    698453 - 152681 = 545772, approx 54 bytes per key.

2. hashtable with constant value

    % memory info
    current bytes allocated 152550
    ...
    % for {set i 0} {$i < 10000 } { incr i } { set t($i) "abcd" }
    % memory info
    current bytes allocated 698363
    ...

    698363 - 152550 = 545813, approx 54 bytes per key.

3. hashtable with variable value

    % memory info
    current bytes allocated 152550
    ...
    % for {set i 0} {$i < 10000 } { incr i } { set t($i) "abcd_$i" }
    % memory info
    current bytes allocated 1037220
    ...

    1037220 - 152550 = 884670, approx 88 bytes per key.

4. empty global variables

    % memory info
    current bytes allocated 152550
    ...
    % for { set i 1 } { $i <= 10000 } { incr i } { set ::a[set i] "" }
    % memory info
    current bytes allocated 729761
    ...

    729761 - 152550 = 577211, approx 57 bytes per variable.

5. global variables with the same value

    % memory info
    ...
    current bytes allocated 152550
    % for { set i 1 } { $i <= 10000 } { incr i } { set ::a[set i] "abcd" }
    % memory info
    ...
    current bytes allocated 708202

    708202 - 152550 = 555652, approx 55 bytes per variable.

6. global variables with different values

    % memory info
    ...
    current bytes allocated 152550
    % for { set i 1 } { $i <= 10000 } { incr i } { set ::a[set i] "abcd_$i" }
    % memory info
    ...
    current bytes allocated 1047070

    1047070 - 152550 = 894520, approx 89 bytes per variable.

7. empty list entries

    % memory info
    ...
    maximum bytes allocated 152550
    % for {set i 1 } { $i <= 10000 } { incr i } { lappend l "" }
    % memory info
    ...
    current bytes allocated 202179

    202179 - 152550 = 49629, approx 5 bytes per list entry.

8. identical list entries

    % memory info
    ...
    current bytes allocated 152550
    % for {set i 1 } { $i <= 10000 } { incr i } { lappend ::l "abcd" }
    % memory info
    ...
    current bytes allocated 202215

    202215 - 152550 = 49665, approx 5 bytes per list entry.

9. different list entries

    % memory info
    ...
    current bytes allocated 152550
    % for {set i 1 } { $i <= 10000 } { incr i } { lappend ::l "abcd_$i" }
    % memory info
    ...
    current bytes allocated 541083

    541083 - 152550 = 388533, approx 39 bytes per list entry.
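The subtraction and division done by hand in these experiments can be scripted. The sketch below works on the text of two memory reports, so it runs in any tclsh. Feeding it live data assumes a TCL_MEM_DEBUG build in which the report can be captured as a string, for example if [memory info] returns its report as the command result; I have not verified that for 8.3. Both proc names are hypothetical.

```tcl
# Pull the "current bytes allocated" figure out of a memory report.
proc currentBytes {report} {
    if {![regexp {current bytes allocated\s+(\d+)} $report -> bytes]} {
        error "no \"current bytes allocated\" line in report"
    }
    return $bytes
}

# Average growth in bytes per item between two reports.
proc bytesPerItem {reportBefore reportAfter count} {
    set delta [expr {[currentBytes $reportAfter] - [currentBytes $reportBefore]}]
    return [expr {double($delta) / $count}]
}

# Figures from experiment 1 (hashtable with empty values).
set before "current bytes allocated 152681"
set after  "current bytes allocated 698453"
puts [bytesPerItem $before $after 10000]
```

Run on the experiment 1 figures this gives about 54.6 bytes per key, which the text above truncates to 54.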
interp costs? interp alias costs?
DKF - note that dict (as proposed in TIP #111 [L3 ]) will give hash access for memory costs much closer to that of a list than that of an array.