[Vincent Wartelle] - mailto:vwartelle@hotmail.com

My temporary conclusions and where they come from.

'''A. OBJECTIVE CONCLUSIONS'''

on tcl 8.3, 32 bit machines only

each tcl object =
   * any new string, number, date, value --> 24 bytes + content size
   * any pre-existing string, data, value --> 4 bytes (pointer only)

content size = depends on encoding and data type:
   * one or two bytes per char for string values
   * may be 0 for a number (integer/double), if it is never used as a string (therefore included in the core tcl object)

[Jeffrey Hobbs] comments: UTF-8 can go up to 3 bytes per char for the 2-byte unicode that Tcl uses internally. Also, content size can be greater for UnicodeString objects, List objects, ... that all malloc some extra space for their internal reps.

   * each variable = 48 bytes + "content size" of the name + "tcl object size" of the content
   * each hash key entry = 48 bytes + "content size" of the key + "tcl object size" of the value
   * each list = 32 bytes + size of each list entry
   * each list entry = 4 bytes + "tcl object size" of the content

'''B. SUBJECTIVE CONCLUSIONS'''

   * When using TCL, don't emulate pointer mechanisms. Copy the complete data when needed. TCL will replace redundant data by pointers.
   * Each different "thing" in a tcl program will cost 24 bytes
   * Variables and hash-tables are costly: 52 bytes overhead for each variable, 52 bytes overhead for each hash table key
   * Lists are not costly: 4 bytes overhead for each element. (Yes, far more if each element is itself a list...)

'''C. INFORMATION FROM NEWSGROUPS'''

1. excerpt from [http://groups.google.com/groups?hl=fr&lr=&safe=off&ic=1&th=38a358d1b5875c48,17&seekm=39509312.55CF52D1%40hursley.ibm.com#p]

 > On a 32 bit machine where alignment is 4 byte boundary
 > and the types have the
 > following sizes,
 > long 4 bytes
 > int 4 bytes
 > char * 4 bytes
 > double 8 bytes
 > void * 4 bytes
 > sizeof (Tcl_Obj) = 4 + 4 + 4 + 4 + MAX (4, 8, 4, 4 + 4)
 > = 24 bytes

2.
excerpt from [http://groups.google.com/groups?hl=fr&lr=&safe=off&ic=1&th=8ae402debdf9a73f,5&seekm=3B014995.9F833320%40mayo.edu#p]

 >> [experiment shows that...] approximately 54 bytes for each key.
 [...]

Well, it takes a certain amount of space to store the hash entry (four words plus the size of the key; median about 20 bytes in your case on a 32-bit machine) and more to store the variable (each entry in an array is an independent variable that can support its own traces, etc.) which adds another 8 words or 32 bytes. This gives about 52 bytes per array member; pretty close to what you report...

'''D. MY EXPERIMENTS'''

tclsh 8.3.2 built with TCL_MEM_DEBUG, on Windows Millennium, 32 bit machine

1. hashtable with empty values

 % memory info
 current bytes allocated 152681
 ...
 % for {set i 0} {$i < 10000 } { incr i } { set t($i) "" }
 % memory info
 current bytes allocated 698453
 ...

698453 - 152681 = 545772, approx 54 bytes per key.

2. hashtable with constant value

 % memory info
 current bytes allocated 152550
 ...
 % for {set i 0} {$i < 10000 } { incr i } { set t($i) "abcd" }
 % memory info
 current bytes allocated 698363
 ...

698363 - 152550 = 545813, approx 54 bytes per key.

3. hashtable with variable value

 % memory info
 current bytes allocated 152550
 ...
 % for {set i 0} {$i < 10000 } { incr i } { set t($i) "abcd_$i" }
 % memory info
 current bytes allocated 1037220
 ...

1037220 - 152550 = 884670, approx 88 bytes per key.

4. empty global variables

 % memory info
 current bytes allocated 152550
 ...
 % for { set i 1 } { $i <= 10000 } { incr i } { set ::a[set i] "" }
 % memory info
 current bytes allocated 729761
 ...

729761 - 152550 = 577211, approx 57 bytes per variable.

5. global variables with the same value

 % memory info
 ...
 current bytes allocated 152550
 % for { set i 1 } { $i <= 10000 } { incr i } { set ::a[set i] "abcd" }
 % memory info
 ...
 current bytes allocated 708202

708202 - 152550 = 555652, approx 55 bytes per variable.

6. global variables with different values

 % memory info
 ...
 current bytes allocated 152550
 % for { set i 1 } { $i <= 10000 } { incr i } { set ::a[set i] "abcd_$i" }
 % memory info
 ...
 current bytes allocated 1047070

1047070 - 152550 = 894520, approx 89 bytes per variable.

7. empty list entries

 % memory info
 ...
 current bytes allocated 152550
 % for {set i 1 } { $i <= 10000 } { incr i } { lappend l "" }
 % memory info
 ...
 current bytes allocated 202179

202179 - 152550 = 49629, approx 5 bytes per list entry.

8. identical list entries

 % memory info
 ...
 current bytes allocated 152550
 % for {set i 1 } { $i <= 10000 } { incr i } { lappend ::l "abcd" }
 % memory info
 ...
 current bytes allocated 202215

202215 - 152550 = 49665, approx 5 bytes per list entry.

9. different list entries

 % memory info
 ...
 current bytes allocated 152550
 % for {set i 1 } { $i <= 10000 } { incr i } { lappend ::l "abcd_$i" }
 % memory info
 ...
 current bytes allocated 541083

541083 - 152550 = 388533, approx 39 bytes per list entry.

----
[Arts and crafts of Tcl-Tk programming]
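
The per-item figures from section A can be turned into a rough estimator, which makes it easier to compare them with the measurements in section D. The following is only a sketch under those assumptions (Tcl 8.3, 32 bit machine); the proc names `valueCost`, `entryCost` and `listCost` are hypothetical helpers, not Tcl built-ins, and the results ignore allocator overhead, string terminators and the growth of the hash table itself.

```tcl
# Rough per-item estimators built from the section A figures
# (Tcl 8.3, 32 bit). All proc names are hypothetical helpers.

# Cost of a value: a 24-byte object plus its content when the value is
# new, or a 4-byte pointer when an identical object already exists.
proc valueCost {contentBytes {shared 0}} {
    if {$shared} {
        return 4
    }
    return [expr {24 + $contentBytes}]
}

# Variable or array (hash) entry: 48 bytes + name/key bytes + value cost.
proc entryCost {nameBytes valueBytes} {
    return [expr {48 + $nameBytes + $valueBytes}]
}

# List: 32 bytes for the list itself, then 4 bytes per entry plus the
# object size of each entry's content, as section A states it.
proc listCost {objectSizes} {
    set total 32
    foreach size $objectSizes {
        set total [expr {$total + 4 + $size}]
    }
    return $total
}

# Array entry with a 4-character key and a new 9-character value.
puts [entryCost 4 [valueCost 9]]      ;# 48 + 4 + 24 + 9 = 85
# Same key, value shared with an existing object.
puts [entryCost 4 [valueCost 0 1]]    ;# 48 + 4 + 4 = 56
# List of three new 9-character strings.
puts [listCost [list [valueCost 9] [valueCost 9] [valueCost 9]]]  ;# 32 + 3*(4 + 33) = 143
```

The first two estimates (85 and 56 bytes) land close to what experiments 3 and 2 measured per key: 884670 / 10000 is about 88.5 bytes, and 545813 / 10000 is about 54.6 bytes. The gap is plausibly the overhead the estimator leaves out.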