if 0 {[Richard Suchenwirth] 2003-03-18 - Trees are a fundamental graph and data structure. They consist of nodes, where each node has a content (e.g. a string) and zero or more child nodes. Each node except the "root" has exactly one parent node. In Tcl, trees can be represented in various ways. Since 8.4, nested lists make an efficient tree representation, where access goes with [lset] and multi-index [lindex]. The following routine traverses such a tree and returns a list of node indices that can be used to iterate with [foreach] and [lindex] to access each node in sequence. For a silly example, consider the following directory tree: / /bin /usr /usr/bin /usr/local /usr/local/bin /usr/local/lib which as a nested list, where each node is a directory, can very compactly be written as {"" bin {usr bin {local bin lib}}} The list of all node indices is 0 1 {2 0} {2 1} {2 2 0} {2 2 1} {2 2 2} which, when iterated over with [lindex], enumerates all directory basenames: 0: 1:bin 2 0:usr 2 1:bin 2 2 0:local 2 2 1:bin 2 2 2:lib and, with the additional code in ''absolutePath'' and ''fromRoot'', can also reconstruct the absolute paths (with an anomaly in /, which comes as empty string - but that's not a bug of these algorithms, but a peculiarity that Unix-like pathnames, Tk widget pathnames, Tcl namespace names have in common): 0:, 1:bin,/bin 2 0:usr,/usr 2 1:bin,/usr/bin 2 2 0:local,/usr/local 2 2 1:bin,/usr/local/bin 2 2 2:lib,/usr/local/lib We observe that "leaves", i.e. nodes which have no children, have a nonzero as last index element, while nodes with children have a zero there. If you chop the trailing zero off, [lindex] gives you the subtree starting from that node. ''[Lars H], 13 May 2005: No, '''all''' node indices should end with a zero. The only reason it works to leave it out in this example is that all node contents are equal to their [list]-quoted forms. Consider the following tree (containing the Swedish monarchs of the Vasa dynasty):'' set tree {{Gustav Vasa} {{Erik XIV}} {{Johan III} Sigismund} {{Karl IX} {{Gustav II Adolf} Kristina}}} ''The proper index of Erik XIV is "1 0" despite him being a leaf, because he has a space in his name. "1" is the index for the subtree containing only that leaf, but that still has list-quoting in place.'' An alternative would have been to represent each node as a pair {content children}, where the children are again a list. This would however lead to a much higher nesting depth: {"" {{bin {}} {usr {{bin {}} {local {{bin {}} {lib {}}}}}}}} while making the algorithms slightly simpler. As the procedures are written once, but hopefully used on many big trees, I decided for the simpler data representation. } proc traverse {tree {prefix ""}} { set res {} if {[llength $tree]>1} { lappend res [concat $prefix 0] ;# content set i 0 foreach child [lrange $tree 1 end] { eval lappend res [traverse $child [concat $prefix [incr i]]] } } else {set res [list $prefix]} ;# leaf set res } proc fromRoot index { set res {} set path {} foreach i $index { if $i {lappend res [concat $path 0]} lappend path $i } lappend res $index } proc absolutePath {tree index} { set res {} foreach i [fromRoot $index] { lappend res [lindex $tree $i] } set res } if 0 {Of course we want to modify such trees too - here's a first shot which inserts into a given tree, at given node ID, another (sub)tree (which might of course be just a single node) as child of the specified node. See usage examples in the test code at bottom: } proc addSubtree {tree index subtree} { if {[lindex $index end]==0} {set index [lrange $index 0 end-1]} set node [lindex $tree $index] lappend node $subtree lset tree $index $node set tree } #------------ Testing: set testtree {"" bin {usr bin {local bin lib}}} puts [traverse $testtree] foreach i [traverse $testtree] { puts $i:[lindex $testtree $i],[join [absolutePath $testtree $i] /] } set testtree [addSubtree $testtree {2 0} lib] set testtree [addSubtree $testtree {2 3} tcl8.4] puts "added /usr/lib" foreach i [traverse $testtree] { puts $i:[lindex $testtree $i],[join [absolutePath $testtree $i] /] } if 0 {More tree routines: Determining the '''parent of a node''', given its index, can be done without having to look at the tree itself. We have to distinguish the cases of a non-leaf, where we first chop off the trailing 0. As parent for root an empty string is returned by the second, "one-armed" [if] - the empty string result should be checked after calling: it is a valid index, but one that returns the whole tree if used with [lindex]. } proc parent index { if {[lindex $index end]==0} {set index [lrange $index 0 end-1]} if {$index != ""} {lreplace $index end end 0} } ---- Pierre Coueffin - 2005-05-12 I wanted to embed a graphical representation of some nested-list type trees that my code generates into a paper I am working on. I came up with the following code, which needs BWidget to work. proc gui {w tree} { package require BWidget Tree $w foreach i [lrange [traverse $tree] 1 end] { set parent [absolutePath $tree [parent $i]] if {$parent == {{}} } { set parent root } set node [absolutePath $tree $i] set text [lindex $tree $i] $w insert end $parent $node -text $text -open yes } return $w } if 0 { Then I do: pack [gui .tree $treedata] and I can generate a nice postscript representation by abusing the knowledge that BWidget uses a canvas widget to draw trees on: set postscript [.tree.c postscript] } 2009-02-10 I noticed one problem with the gui proc above. It assigns node names based on the text in the list item. The problem with that is, in some applications, you may have duplicate text in different list members. It raises an error when it tries to name a new node with duplicate text if the nodes are at the same level, i.e. siblings, in the tree. You need a unique ID for each new node. I changed the proc like this, and it seems to work better. proc gui {w tree} { package require BWidget Tree $w -width 30 -height 35 # the variable i here seems to be unique to each node. It comes from the traverse proc and seems to assign # a unique value for each item in the tree, so I used that for the node names. Now I can use the same text in sibling # list items without causing the error. foreach i [lrange [traverse $tree] 1 end] { set parent [absolutePath $tree [parent $i]] if {$parent == {{}} } { set parent root } if {$parent != "root"} { set parent [parent $i] } set node "$i" set text [lindex $tree $i] $w insert end $parent $node -text $text -open yes } return $w } Here's some test data I used that did not work with the first proc, but does seem to work with the above changes: For example, the three instances of "appearance" at the same level in the list would cause an error before, but now are tolerated. # Patient with a sore throat? set treedata { "" {{appearance} {no_distress}} {Viral} {{appearance} {toxic}} {Epiglottitis} {{appearance} {uncomfortable} {{exudate} {no} {{ulcers} {no}} {Viral} {{ulcers} {yes}} {{Herpes_Stomatitis}}} {{exudate} {yes} {{temp} {100.5}} {Mononucleosis} {{temp} {101}} {Mononucleosis} {{temp} {103}} {Streptococcal}}} } if 0 { ---- See also: * [Binary trees] * [Decision trees] * [Complex data structures] ---- !!!!!! %| [Category Concept] | [Category Data Structure] |% [Arts and crafts of Tcl-Tk programming] !!!!!! }