Version 29 of A lightweight digraph package

Updated 2007-05-24 16:47:53 by NEM

NEM 23 May 2007: I've just moved house, and found myself last night needing to precompute some route plans for a group of agents embodied in a virtual environment. As I haven't even got a working phone line in the new house, I had no access to the wiki, and no copy of tcllib on my laptop. So I hacked up this lightweight directed-graph (digraph) package with some utilities for searching the space with a variety of strategies (breadth-first, depth-first, A* etc). The code is quite lightweight and fairly speedy. It makes extensive use of dicts and some other 8.5isms, but I expect a back-port to 8.4 wouldn't be too difficult.

Version 0.2: Fixed some bugs in the search and made node/edge methods take a graph variable rather than the graph itself.

Version 0.3: More fixes...

Version 0.4: And a few more. It is now possible to prevent the current node being expanded in digraph search by using the continue command. This can be useful if you want to avoid expanding the same node twice from different search paths. (Useful when using the uniform-cost strategy which guarantees in certain conditions that any node you encounter for the first time will be on the minimum route to that node).

Version 0.5: The digraph is now strict: there can only be a single edge in each direction between a pair of nodes (i.e., edges are now stored in a dict rather than a list). I may make this a switch.


 # digraph.tcl --
 #
 #       A basic directed graph library.
 #
 # Copyright (c) 2007 Neil Madden ([email protected]).
 #
 # RCS: $Id: digraph.tcl,v 1.3 2007/05/23 17:41:46 nem Exp $

 package require Tcl     8.5
 package provide digraph 0.5

 namespace eval digraph {
     namespace export {[a-z]*}
     namespace ensemble create

     # create --
     #
     #       Create an empty graph.
     #
     proc create {} { return [dict create] }
     # node graph name --
     #
     #       Create a new node in the graph if it doesn't already exist.
     #
     proc node {graphVar name} {
         upvar 1 $graphVar graph
         if {![dict exists $graph $name]} {
             dict set graph $name [dict create]
         }
         return $graph
     }

     proc exists {graph node} { dict exists $graph $node }

     # edge graph source dest ?cost [1]? --
     #
     #       Create a directed edge (arc) from source to dest in the graph,
     #       with an optional cost weighting.
     #
     proc edge {graphVar source dest {cost 1}} {
         upvar 1 $graphVar graph
         dict set graph $source $dest $cost
     }
     # size graph --
     #
     #       Return the number of edges in the graph.
     #
     proc size graph {
         set size 0
         dict for {node edges} $graph { incr size [dict size $edges] }
         return $size
     }

     # order graph --
     #
     #       Return the number of nodes in the graph.
     #
     proc order graph { dict size $graph }

     # nodes graph ?-progress cmd? nodeVar script --
     #
     #       Iterate through the nodes in the graph (in arbitrary order)
     #       calling a script for each node. If the -progress option is given
     #       then this command is called with the current iteration count and
     #       the total number of nodes on each iteration.
     #
     proc nodes {graph args} {
         array set options { -progress "" }
         array set options [lrange $args 0 end-2]
         lassign [lrange $args end-1 end] nodeVar script
         upvar 1 $nodeVar node
         set total [order $graph]
         set done 0
         foreach node [dict keys $graph] { 
             Invoke #0 $options(-progress) [incr done] $total
             Uplevel 1 $script 
         }
     }     

     # edges graph ?-progess cmd? {sourceVar destVar costVar} script --
     #
     #      Iterate through the edges in the graph (in arbitrary order)
     #      calling a script for each edge, passing in the source and
     #      destination nodes of the edge and the cost. If the -progress
     #      option is given then this command is called with the current
     #      iteration count and the total number of edges on each iteration.
     #
     proc edges {graph args} {
         array set options { -progress "" }
         array set options [lrange $args 0 end-2]
         lassign [lrange $args end-1 end] vars script
         lassign $vars sourceVar destVar costVar
         upvar 1 $sourceVar source $destVar dest $costVar cost
         # calculate total number of edges
         set total [size $graph]
         dict for {source dests} $graph {
             dict for {dest cost} $dests {
                 Invoke #0 $options(-progress) [incr done] $total
                 Uplevel 1 $script
             }
         }
     }
     # degree graph node --
     #
     #      Return the number of outgoing edges from a node in the graph.
     #
     proc degree {graph node} { dict size [dict get $graph $node] }

     # search graph source strategy {pathVar costVar} script --
     #
     #       Walk the entire graph structure from the source node without
     #       cycles. New nodes are searched in the order determined by the
     #       strategy command which queues new nodes in some order. For each
     #       new path encountered the script is called with the current path
     #       from the source node to that node and the total cost of that path.
     #
     proc search {graph source strategy vars script} {
         lassign $vars pathVar costVar
         upvar 1 $pathVar path $costVar cost
         set queue [list [list [list $source] 0]]
         Do {
             # pop first item off queue
             set queue [lassign $queue next]
             lassign $next path cost
             set node [lindex $path end]
             # visit the node
             Uplevel 1 $script
             # expand this node
             expand $graph $node $path {newPath newCost} {
                 set queue [Invoke 1 $strategy $queue $newPath \
                     [expr {$cost+$newCost}]]
             }
         } while {[llength $queue]}
     }
     # expand graph node path {pathVar costVar} script --
     #
     #       Iterate through all new paths that are reachable from "node" via
     #       an outgoing edge.
     #
     proc expand {graph node path vars script} {
         lassign $vars pathVar costVar
         upvar 1 $pathVar newPath $costVar cost
         if {![dict exists $graph $node]} { return }
         dict for {dest cost} [dict get $graph $node] {
             if {$dest ni $path} {
                 set newPath [linsert $path end $dest]
                 Uplevel 1 $script
             }
         }
     }
     # readdot dotfile --
     #
     #       Read a Graphviz .dot file and construct a digraph based on the
     #       information contained in it. Currently this just forms a simple
     #       unweighted graph and ignores most information in the file.
     #
     proc readdot {dotfile} {
         set in [open $dotfile]
         set g [create]
         foreach line [split [read $in] \n] {
             if {[regexp {(.*)->(.*);} $line -> source dest]} {
                 edge g [string trim $source] [string trim $dest]
             }
         }
         return $g
     }

     # STRATEGIES...
     proc depth-first {queue element cost} {
         # add to front of queue
         linsert $queue 0 [list $element $cost]
     }
     proc breadth-first {queue element cost} {
         # add to rear of queue
         linsert $queue end [list $element $cost]
     }
     # expand nodes with lowest cost-so-far first
     proc uniform-cost {queue element cost} {
         # find place to insert in queue
         set idx 0
         foreach elem $queue {
             lassign $elem path itemcost
             if {$itemcost >= $cost} { break }
             incr idx
         }
         linsert $queue $idx [list $element $cost]
     }
     # expand nodes with best (lowest) heuristic cost first
     proc best-first {f queue element cost} {
         set node [lindex $element end]
         set h [Invoke #0 $f $node]
         # find place to insert in queue
         set idx 0
         foreach item $queue {
             lassign $item path itemg itemh
             if {$itemh >= $h} { break }
             incr idx
         }
         linsert $queue $idx [list $element $cost $h]
     }
     proc a-star {heuristic queue element cost} {
         best-first [namespace code [list A* $heuristic $cost]] \
             $queue $element $cost
     }

     # Private helper utilities.
     proc A* {h cost element} { expr {[Invoke #0 $h $element]+$cost} }
     proc Do {script _while_ cond} {
         set rc [catch { uplevel 1 $script } result]
         if {$rc == 3} { return }
         if {$rc != 0 && $rc != 4} { return -code $rc $result }
         Uplevel 1 [list while $cond $script]
     }
     proc Invoke {level cmd args} {
         if {[llength $cmd] == 0} { return }
         if {[string is integer -strict $level]} { incr level }
         uplevel $level $cmd $args
     }
     proc Uplevel {level script} {
         if {[string is integer -strict $level]} { incr level }
         set rc [catch { uplevel $level $script } result]
         if {$rc == 0} { return }
         if {$rc != 1} { return -code $rc $result }
         return -code error -errorcode $::errorCode -errorinfo $::errorInfo $result
     }
 }

Example

I'll try and update my pages on state space searching to use this simple package now. Until then, here is the example code that I was using to generate least-cost routes (I store the routes in an existing SQLite database):

 package require Tcl     8.5
 package require digraph 0.1
 package require sqlite3 3.3

 proc routes {graph file} {
   sqlite3 db $file
   db transaction {
     digraph nodes $graph -progress ::progress source {
       set visited [dict create]
       digraph search $graph $source {digraph least-cost} path {
         set dest [lindex $path end]
         if {[dict exists $visited $dest]} { continue } ;# don't expand again
         dict set visited $dest 1
         db eval {INSERT INTO route VALUES($source, $dest, $path)}
       }
     }
   }
   db close
 }
 proc progress {done total} {
   set percent [expr {int(double($done)/double($total)*100)}]
   puts -nonewline [format "\rProgress: |%-50.50s| %3d%% (%d/%d)" \
     [string repeat "=" [expr {$percent/2}]]> $percent \
     $done $total]
   flush stdout
 }
 set rhun [digraph readdot ~/Desktop/Rhun_map.dot]
 puts "Computing routes..."
 routes $rhun rhun_routes.db
 puts "\nDone."

Lars H: Hmm... Is this something like the real problem you're after, or just an example? (Minimum-cost path (a.k.a. shortest path) is one of the basic graph problems for which there exist low order polynomial (i.e., fast) algorithms, but searching the space of all paths isn't the right way to do it.) The digraph package itself looks nice, but a problem for many algorithms is that it only gives you quick access to edges leaving a vertex; it is often also necessary to know which edges are coming in to a vertex.

NEM: Well, for my problems I only care about outgoing edges, hence the bias in the package. I'd be interested in pointers to faster minimum-cost algorithms, but the basic approach here worked fine -- it generated all ~170000 routes I needed in under 10 minutes, which is fine for a one-off operation. I'm currently working on an implementation of Dijkstra's algorithm, if that is what you were referring to.

Lars H: OK, the must-know algorithm for Shortest Path is Dijkstra's algorithm [L1 ]. One that is more immediately lends itself to your package is the Moore–Bellman–Ford algorithm [L2 ], which can be coded as follows:

  proc MBF {G source} {
     digraph nodes $G v {set l($v) infinity}
     set l($source) 0
     for {set count [digraph size $G]} {$count>1} {incr count -1} {
        digraph edges $G {u v c} {
           if {$l($u) != "infinity" && $l($v) > $l($u) + $c} then {
              set l($v) [expr {$l($u) + $c}]
              set p($v) $u
           }
        }
     }
     set res {}
     foreach v [array names p] {
        lappend res $v [list $l($v) $p($v)]
     }
     return $res
  }

(Untested, I don't have an 8.5 immediately available.) MBF returns a dictionary with one entry for every vertex reachable from $source, whose entries are lists of two elements. The first element is the length of the shortest (cost of the cheapest) path from $source to that vertex. The second element is the previous vertex on that path. Dijkstra's algorithm is faster (particularly so for dense graphs), whereas Moore–Bellman–Ford can cope with cycles of negative weight in the graph.

NEM: Thanks for this, I'll try and integrate it tonight. (Note: in recent Tcls expr recognises "Inf" as infinity). I'll also see about implementing Floyd's algorithm or some other all-pairs shortest-path algorithm, as that is what I primarily needed this package for (I am pre-computing all routes between all pairs of locations in a virtual environment so that agents only have to do local path-finding at runtime). Feel free to update the code directly if you have good ideas (e.g., new algorithms, or if you want to adjust the representation to make it quicker to find incoming edges).

BTW, the "size" procedure should really be named "order". The size of a graph is the number of edges.

NEM: Fixed, thanks.


LV Is there any thought to adding your package to tcllib, or at least incorporating what you have learned into whatever relevant module might already be there?

NEM Tcllib already has a graph package which is roughly a superset of the functionality here. The main differences are that the graphs here are values (strings) rather than opaque "objects", and the search methods here are different to the ones in tcllib. The tcllib graph has a "walk" method which includes breadth-first and depth-first searches, and could perhaps incorporate a general search strategy queuing function, as here, but I don't know.


See also Heuristic Searches for another example.


[ Category Data Structure ]