Version 0 of Trees as nested lists

Updated 2003-03-18 08:15:28

if 0 {Richard Suchenwirth 2003-03-18 - Trees are a fundamental graph and data structure. They consist of nodes, where each node has a content (e.g. a string) and zero or more child nodes. Each node except the "root" has exactly one parent node.

In Tcl, trees can be represented in various ways. Since 8.4, nested lists make an efficient tree representation, where access goes with lset and multi-index lindex. The following routine traverses such a tree and returns a list of node indices that can be used to iterate with foreach and lindex to access each node in sequence.

For a silly example, consider the following directory tree:

 /
 /bin
 /usr
 /usr/bin
 /usr/local
 /usr/local/bin
 /usr/local/lib

which as a nested list, where each node is a directory, can very compactly be written as

 {"" bin {usr bin {local bin lib}}}

The list of all node indices is

 0 1 {2 0} {2 1} {2 2 0} {2 2 1} {2 2 2}

which, when iterated over with lindex, enumerates all directory basenames:

 0:
 1:bin
 2 0:usr
 2 1:bin
 2 2 0:local
 2 2 1:bin
 2 2 2:lib

and, with the additional code in absolutePath and fromRoot, can also reconstruct the absolute paths (with an anomaly in /, which comes as empty string - but that's not a bug of these algorithms, but a peculiarity that Unix-like pathnames, Tk widget pathnames, Tcl namespace names have in common):

 0:,
 1:bin,/bin
 2 0:usr,/usr
 2 1:bin,/usr/bin
 2 2 0:local,/usr/local
 2 2 1:bin,/usr/local/bin
 2 2 2:lib,/usr/local/lib

We observe that "leaves", i.e. nodes which have no children, have a nonzero as last index element, while nodes with children have a zero there. If you chop the trailing zero off, lindex gives you the subtree starting from that node.

An alternative would have been to represent each node as a pair {content children}, where the children are again a list. This would however lead to a much higher nesting depth:

 {"" {{bin {}} {usr {{bin {}} {local {{bin {}} {lib {}}}}}}}}

while making the algorithms slightly simpler. As the procedures are written once, but hopefully used on many big trees, I decided for the simpler data representation. }

 proc traverse {tree {prefix ""}} {
    set res {}
    if {[llength $tree]>1} {
        lappend res [concat $prefix 0] ;# content
        set i 0
        foreach child [lrange $tree 1 end] {
            eval lappend res [traverse $child [concat $prefix [incr i]]]
        }
    } else {set res [list $prefix]} ;# leaf
    set res
 }
 proc fromRoot index {
    set res {}
    set path {}
    foreach i $index {
        if $i {lappend res [concat $path 0]}
        lappend path $i
    }
    lappend res $index
 }
 proc absolutePath {tree index} {
    set res {}
    foreach i [fromRoot $index] {
        lappend res [lindex $tree $i]
    }
    set res
 }

#------------ Testing:

 set testtree {"" bin {usr bin {local bin lib}}}
 puts [traverse $testtree]
 foreach i [traverse $testtree] {
    puts $i:[lindex $testtree $i],[join [absolutePath $testtree $i] /]
 }

if 0 {


Category Concept | Arts and crafts of Tcl-Tk programming