Directory Tree Synchronization

George Peter Staplin: I keep some backups on a 2nd hard disk, as a sort of poor-man's RAID/backup (in addition to CD-R backups). I use the following tool to synchronize directory trees.

Revision History

Aug 31, 2004: the first version

Nov 23, 2004: new -nocase flag and better/simpler implementation overall

Syntax

$ sync_trees.tcl

syntax: ?-nocase? tree-a tree-b

The -nocase flag is useful for synchronizing a FAT (caseless) file system with a unix file system that is case sensitive.


 #!/usr/bin/env tclsh8.4
 #Copyright 2004 (c) George Peter Staplin

 proc build.tree.stat.info {ar_ptr dir} {
  upvar $ar_ptr ar
  array set ar {}
  foreach f [glob -nocomplain [file join $dir *]] {
   if {[file isdirectory $f]} {
    build.tree.stat.info ar $f
   } elseif {[file isfile $f]} {
    set ar($f) [file size $f]
   }
  }
 }

 proc copy.from.to {from to} {
  puts "COPYING $from $to"
  file mkdir [file dirname $to]
  file copy -force $from $to
 }

 proc sync.tree {a_ptr FROM_DIR b_ptr TO_DIR} {
  upvar $a_ptr a
  upvar $b_ptr b

  foreach {f size} [array get a] {
   if {![info exists b($f)]} {
    copy.from.to [file join $FROM_DIR $f] [file join $TO_DIR $f]
   } elseif {$size != [set b($f)]} {
     puts stderr "The size for $f in $FROM_DIR doesn't match $TO_DIR."
   }
  }
 }

 proc sync.tree.caseless {from FROM_DIR to TO_DIR} {
  foreach {f size} $from {
   set do_copy 1
   foreach {tof tosize} $to {
    if {[string equal -nocase $f $tof]} {
     set do_copy 0
     break
    }
   }
   if {$do_copy} {
    copy.from.to [file join $FROM_DIR $f] [file join $TO_DIR $f]
   } elseif {$tosize != $size} {
    puts stderr "The size for $f in $FROM_DIR doesn't match $TO_DIR."
   }
  }
 }

 proc syntax {} {
  puts stderr "syntax: ?-nocase? tree-a tree-b"
 }

 proc main {argc argv} {

  set caseless 0

  switch -- $argc {
   2 {
    foreach {tree_a tree_b} $argv {}
   }

   3 {
    foreach {flag tree_a tree_b} $argv {}
    if {![string match -nocase -noc* $flag]} {
     syntax
     return 1
    }
    set caseless 1
   }

   default {
    syntax
    return 1
   } 
  }

  set oldwd [pwd]
  cd [set DIR_A [file normalize $tree_a]]
  build.tree.stat.info a {}

  cd $oldwd
  cd [set DIR_B [file normalize $tree_b]]
  build.tree.stat.info b {}

  if {$caseless} {
   sync.tree.caseless [array get a] $DIR_A [array get b] $DIR_B
   sync.tree.caseless [array get b] $DIR_B [array get a] $DIR_A
  } else {
   sync.tree a $DIR_A b $DIR_B
   sync.tree b $DIR_B a $DIR_A
  }
  return 0
 }
 exit [main $::argc $::argv]

Lars H: Bernard Desgraupes has written a similar feature "Sync Trees" [L1 ] for use in Alpha.

I'm putting a snitvfs over tclhttpd, and of course I need to move stuff into the vfs, and of course I need something like this. Now it's a near no-brainer. I love little snippets that delineate issues and pitfalls, even if I don't use them they save me time. I love youse all. CMcC

See also the the "sync" code by Matt Newman - it has been in SDX for several years now [L2 ] -jcw

SEH -- assuming, for the sake of semantic clarity, that there is a difference between "sync" (which doesn't delete any files, but ensures that both directories have a copy of the union of files in both locations), and "mirror" (which may destroy some files in the target directory in order to make it match the source directory perfectly), into what category does each of the above mentioned programs fall? The code posted here seems to be of the "sync" variety. I have a procedure of the "mirror" variety, which I can post here if it's not redundant.


See also tfu and tddiff.