Version 31 of A versioning virtual filesystem

Updated 2007-05-20 23:03:56 by dkf

SEH A versioning virtual filesystem.

Similar to historical versioning filesystems, each edited version of a file is saved separately; each version file is tagged with a timestamp. A tag is a semi-colon and tag info appended to the file name. A deleted file is represented by a new zero-length file with timestamp and a tag reading "deleted". By default only the latest versions of files are visible. If the latest version is marked deleted, it is invisible.

Based on A template virtual filesystem


SEH 19 Jul 2004 - I added three optional mount flags: -keep , -project , and -time

Usage: Mount <existing directory> -keep <integer> -project <string> -time <seconds or string acceptable to clock scan> <virtual versioning directory>

-keep: sets the maximum number of versions per file to retain. If the -project tag is used, other projects' versions are ignored.

-project: specify a named project and all new edits get tagged with this name string. As long as you're mounted with this option, versions with the corresponding tag will be preferred, and edits tagged by other projects will be invisible. Multiple -project tags can be used; new versions will tagged with all specified project tags - thus you can share edits among several projects.

-time: specify a time in the past and the vfs will appear just as it did at that time. You can make new edits, but they will be invisible until you remount without a time specified. Can be used in conjunction with -project.


SEH 4 Nov 2004 - Cleaned up and tested.

1/7/2005 -- Directories are now time-stamped and project-tagged just like files, so a whole directory can be marked as deleted for one project but still in existence for another.

I also increased the sophistication of how versions with multiple tags and multiple matches with the -project list are chosen. It used to be that the first version found with any matches was chosen, now the version with the most matches is. That is if you mount with the -project flag set as "Ford Mercury Cougar 72" and there are two versions of manual.pdf, one with tags "Ford Mercury" and one with "Ford Cougar" , previously you couldn't sure which version would be visible. Now you will always get the one you expect (i.e., the one tagged "Ford Cougar").


 # versionvfs.tcl --
 #
 #        A versioning virtual filesystem.
 #
 #        Similar to historical versioning filesystems, each edited version of a file is saved separately;
 #        each version file is tagged with a timestamp, and optional project tags.
 #        A deleted file is represented by a new zero-length file with timestamp and a tag reading "deleted".
 #        By default only the latest version is visible.  If the latest version is marked deleted, it is invisible.
 #        Directories are versioned and tagged in the same way as files.
 #
 #
 # Written by Stephen Huntley ([email protected])
 #
 # Install: This code requires that the template vfs (http://wiki.tcl.tk/11938) procedures have already
 # been sourced into the interpreter.
 #
 # Usage: Mount <existing directory> ?-keep <number>? ?-project <project name list>? ?-time <time>? <virtual versioning directory>
 #
 # Options:
 #
 # -keep
 #        maximum number of previous file versions to keep.
 #
 # -project
 #        a list of one or more identifying tags to associate with all files created or edited.
 #        If a file has project tags but none of them is included in the project
 #        name list, it will be invisible.
 #
 #        The project name list is stored in namespace variable $project($root).
 #        This variable can be edited while volume is still mounted, changes will
 #        will be taken into account dynamically.
 #
 #        In cases where several file versions each have multiple tags that match in the 
 #        list in $project($root), the version with the greatest number of matches will be visible.
 #
 #        Deleted files are marked with the tag "deleted".  If this value is in the project
 #        name list, then deleted files will become visible again.
 #
 #        If a file version exists without any project tags, it will be treated as the default, and
 #        will be visible even if tags are specified in $project($root), if no other version has matching tags.
 #
 # -time
 #        A timestamp in the form of [clock seconds], or a string understandable by [clock scan].
 #        Versions of files as they existed at the given time will be visible, rather than the
 #        default latest version.  
 #
 #        Value is stored in the namespace variable $time($root) and can be edited
 #        with volume still mounted.  Visible file versions will change to reflect edited value.

 package require vfs 1

 namespace eval ::vfs::template::version {}

 proc ::vfs::template::version::Mount {args} {
        eval [info body ::vfs::template::Mount]
 }

 namespace eval ::vfs::template::version {

 proc MountProcedure {args} {
        foreach templateProc "Mount Unmount CloseTrace handler" {
                set infoArgs [info args ::vfs::template::$templateProc]
                set infoBody [info body ::vfs::template::$templateProc]
                proc $templateProc $infoArgs $infoBody
        }

        set to [file normalize [lindex $args end]]
        set args [lrange $args 0 end-1]
        set path [file normalize [lindex $args 0]]
        set args [lrange $args 1 end]

        if [info exists ::vfs::_unmountCmd($to)] {::vfs::unmount $to}

        set argsLength [llength $args]
        for {set i 0} {$i < $argsLength} {incr i} {
                switch -- [lindex $args $i] {
                        -keep {
                                set keep [lindex $args [incr i]]
                                if ![string is digit -strict $keep] {continue}
                                set ::vfs::template::version::keep($to) $keep
                        }
                        -project {
                                set project [lindex $args [incr i]]
                                lappend ::vfs::template::version::project($to) $project
                        }
                        -time {
                                set time [lindex $args [incr i]]
                                SetTime $time
                                set ::vfs::template::version::time($to) $time
                        }
                }
        }

        if [catch {glob -directory $path -type {f hidden} .&dir*}] {
                CreateDirectory $path $to {} {}
        }

        lappend pathto $path
        lappend pathto $to
        return $pathto
 }

 proc UnmountProcedure {path to} {
        array unset ::vfs::template::version::keep $to
        array unset ::vfs::template::version::project $to
        array unset ::vfs::template::version::time $to
        return
 }

 proc Access {path root relative actualpath mode} {
        set fileName [VAcquireFile $path $root $relative $actualpath]
        set modeString [::vfs::accessMode $mode]
        if {$modeString == "F"} {
                if ![string first .&dir [file tail $fileName]] {error "no such file or directory"}
                if [file exists $fileName] {return}
                error "no such file or directory"
        }
        set modeString [split $modeString {}]
        set fileString {}
        if {[string equal $modeString "R"] && [file readable $fileName]} {return}
        if {[string equal $modeString "W"] && [file writable $fileName]} {return}
        if {[string equal $modeString "X"] && [file executable $fileName]} {return}
        file stat $fileName stat
        foreach { mask pairs } {
                00400 { 00400 r }
                00200 { 00200 w }
                04100 { 04100 s 04000 S 00100 x }
                00040 { 00040 r }
                00020 { 00020 w }
                02010 { 02010 s 02000 S 00010 x }
                00004 { 00004 r }
                00002 { 00002 w }
                01001 { 01001 t 01000 T 00001 x }
            } {
                set value [expr $stat(mode) & $mask]
                set bit -
                foreach { x b } $pairs {
                    if { $value == $x } {
                        set bit $b
                    }
                }
                append bitString $bit
      }
        set readable [regexp -all "r" $bitString]
        set writable [regexp -all "w" $bitString]
        set executable [regexp -all "x" $bitString]
        foreach {mode count} "R $readable W $writable X $executable" {
                if {([string first $mode $modeString] > -1) && !$count} {error "$mode access not allowed"}
        }
        if [string equal $modeString "X W"] {
                if {($writable == 3) && ($executable == 3)} {
                        return
                } elseif [file writable $fileName] {
                        if {[regexp -all "wx" $bitString] == $writable} {
                                return
                        } elseif [file executable $fileName] {
                                return
                        }
                }
        }
        if [string equal $modeString "R W"] {
                if {($writable == 3) && ($readable == 3)} {
                        return
                } elseif [file writable $fileName] {
                        if {[regexp -all "rw" $bitString] == $writable} {
                                return
                        } elseif [file readable $fileName] {
                                return
                        }
                }
        }
        if [string equal $modeString "R X"] {
                if {($readable == 3) && ($executable == 3)} {
                        return
                } elseif [file executable $fileName] {
                        if {[regexp -all {r[w-]x} $bitString] == $executable} {
                                return
                        } elseif [file readable $fileName] {
                                return
                        }
                }
        }

        foreach mS $modeString {
                set errorMessage "not [string map {R readable W writable X executable} $mS]"
                if {[lsearch $fileString $mS] == -1} {error $errorMessage}
        }
 }

 proc CreateDirectory {path root relative actualpath} {
        file mkdir [file join $path $relative]
        set fileName [VFileNameEncode [file join $path $relative]]
        set fileName [file join $fileName .&dir[file tail $fileName]]\;[VCreateTag $root]
        close [open $fileName w]
        catch {file attributes $fileName -hidden 1}
 }

 proc DeleteFile {path root relative actualpath} {
        set fileName [VFileNameEncode [file join $path $relative]]\;[VCreateTag $root]
        set fileName [split $fileName \;]
        set fileName [linsert $fileName 2 "deleted"]
        set fileName [join $fileName \;]
        close [open $fileName w]
 }

 proc FileAttributes {path root relative actualpath} {
        set fileName [VAcquireFile $path $root $relative $actualpath]
        array set attributes [file attributes $fileName]
        array unset attributes -longname
        array unset attributes -shortname
        return [array get attributes]
 }

 proc FileAttributesSet {path root relative actualpath attribute value} {
        if [info exists ::vfs::template::version::time($root)] {
                set existingTime $::vfs::template::version::time($root)
                set ::vfs::template::version::time($root) [clock seconds]
                set fileName [VAcquireFile $path $root $relative $actualpath]
                set ::vfs::template::version::time($root) $existingTime
        } else {
                set fileName [VAcquireFile $path $root $relative $actualpath]
        }
        file attributes $fileName $attribute $value
 }

 proc MatchInDirectory {path root relative actualpath pattern types} {
        if [::vfs::matchDirectories $types] {lappend typeString d}
        if [::vfs::matchFiles $types] {lappend typeString f}

        set globList [glob -directory [file join $path $relative] -nocomplain -types $typeString *]
        if [catch {::vfs::filesystem info $path}] {append globList " [glob -directory [file join $path $relative] -nocomplain -types "$typeString hidden" *]"}

        set newGlobList {}
        set acquireAttempts {}
        foreach gL $globList {
                if [file isdirectory $gL] {
                        set acquiredFile [VAcquireFile $path $root [list [file join $relative [file tail $gL]]] $actualpath]
                        if ![string equal $acquiredFile $gL] {continue}
                } else {
                        if ![string first .&dir [file tail $gL]] {continue}
                        set gL [VFileNameDecode $gL]
                        if {[lsearch $acquireAttempts $gL] > -1} {continue}
                        lappend acquireAttempts $gL
                        set acquiredFile [VAcquireFile $path $root [list [file join $relative [file tail $gL]]] $actualpath]
                        if [string equal $acquiredFile $gL] {continue}
                }
                set gL [eval file join \$root [lrange [file split $gL] [llength [file split $path]] end]]
                if [string match $pattern [file tail $gL]] {lappend newGlobList $gL}
        }
        return [lsort -unique $newGlobList]
 }

 proc Open {path root relative actualpath mode permissions} {

        if [catch {package present md5 2}] {package forget md5 ; package require md5 2}
        set fileName [VAcquireFile $path $root $relative $actualpath]
        if {$mode == "r"} {return [open $fileName]}
        set md5 {}

        if {$fileName == [file join $path $relative]} {
                set fileName [VFileNameEncode [file join $path $relative]]\;[VCreateTag $root]
                close [open $fileName $mode]
                catch {file attributes $fileName -permissions $permissions}
                set channelID [vfs::memchan]
                set ::vfs::template::version::filestats($channelID) "filename [list $fileName] -permissions $permissions md5 [list $md5]"
                return $channelID
        }

        set f [open $fileName r]
        fconfigure $f -translation binary
        set md5 [::md5::md5 -hex -- [read $f]]
        close $f
        set filed [vfs::memchan]
        if {[string index $mode 0] == "a"} {
                set f [open $fileName r]
                fconfigure $f -translation binary
                fconfigure $filed -translation binary
                fcopy $f $filed
                close $f
                seek $filed 0
        }
        file stat $fileName fileStats
        array set fileStats [file attributes $fileName]
        set fileStats(md5) $md5
        set ::vfs::template::version::filestats($filed) [array get fileStats]
        return $filed
 }

 proc Close {channelID path root relative actualpath mode} {
 # Do not close the channel in the close callback!
 # It will crash Tcl!  Honest!
 # The core will close the channel after you've taken what info you need from it.

 #        close $channelID
        if {$mode == "r"} {return}
        if [catch {package present md5 2}] {package forget md5 ; package require md5 2}
        array set fileStats $::vfs::template::version::filestats($channelID)

        unset ::vfs::template::version::filestats($channelID)

        if [string equal $fileStats(md5) {}] {
                file delete $fileStats(filename)
        }
        set cr \\x0D
        set lf \\x0A
        set crlf \\x0D\\x0A
        set outMode [lindex [fconfigure $channelID -translation] end]
        seek $channelID 0
        fconfigure $channelID -translation binary
        set md5 [::md5::md5 -hex -- [read $channelID]]
        if [string equal -nocase $md5 $fileStats(md5)] {return}

        set fileName [VFileNameEncode [file join $path $relative]]\;[VCreateTag $root]
        set f [open $fileName w]
        fconfigure $f -translation binary
        seek $channelID 0
        fcopy $channelID $f
        close $f
        catch {file attributes $fileName -readonly 0}
        catch {file attributes $fileName -permissions rw-rw-rw-}
        catch {file attributes $fileName -owner $fileStats(uid)}
        catch {file attributes $fileName -group $fileStats(gid)}
        catch {file atime $fileName $fileStats(atime)}

        foreach attr [array names fileStats] {
                if [string first "-" $attr] {continue}
                if ![string first "-permissions" $attr] {continue}
                if ![string first "-readonly" $attr] {continue}
                catch {file attributes $fileName $attr $fileStats($attr)}
        }
        catch {file attributes $fileName -permissions $fileStats(mode)}
        catch {file attributes $fileName -readonly $fileStats(-readonly)}

        return
 }

 proc RemoveDirectory {path root relative actualpath} {
        set fileName [VFileNameEncode [file join $path $relative]]
        set fileName [file join $fileName .&dir[file tail $fileName]]\;[VCreateTag $root]

        set fileName [split $fileName \;]
        set fileName [linsert $fileName 2 "deleted"]
        set fileName [join $fileName \;]
        close [open $fileName w]
        catch {file attributes $fileName -hidden 1}
 }

 proc Stat {path root relative actualpath} {
        set fileName [VAcquireFile $path $root $relative $actualpath]
        file stat $fileName fs
        return [array get fs]
 }

 proc Utime {path root relative actualpath atime mtime} {
        if [info exists ::vfs::template::version::time($root)] {
                set existingTime $::vfs::template::version::time($root)
                set ::vfs::template::version::time($root) [clock seconds]
                set fileName [VAcquireFile $path $root $relative $actualpath]
                set ::vfs::template::version::time($root) $existingTime
        } else {
                set fileName [VAcquireFile $path $root $relative $actualpath]
        }

        file atime $fileName $atime
        file mtime $fileName $mtime
 }

 proc SetTime {time} {
        if ![string is digit -strict $time] {catch {set time [clock scan $time]}}
        if ![string is digit -strict $time] {error "invalid time value."}
        set time [string range $time 0 [expr [string length [clock seconds]] - 1]]
        set time "[clock scan [clock format $time -format %Y%m%dT%H%M%S -gmt 1]]000"
 }

 proc VAcquireFile {path root relative actualpath} {
        set fileName [VFileNameEncode [file join $path $relative]]
        if [file isdirectory [file join $path $relative]] {
                set fileName [file join $fileName .&dir[file tail $fileName]]
                set relative [file join $relative .&dir[file tail $relative]]
        }
        set versions [glob -path $fileName -nocomplain -types f "\;*"]
        if [catch {::vfs::filesystem info $path}] {append versions " [glob -path $fileName -nocomplain -types "f hidden" "\;*"]"}

        set versions [string trim $versions]
        if {$versions == {}} {return [file join $path $relative]}

        set checkProject 0
        if [info exists ::vfs::template::version::project($root)] {
                set projects [string map {; &s} [string map {& &a} $::vfs::template::version::project($root)]]
                set checkProject 1
        }
        foreach ver $versions {
                set ver $root/[file tail $ver]
                lappend versionFiles $ver
                if !$checkProject {continue}
                foreach project $projects {
                        if {[lsearch [lrange [split $ver \;] 1 end] $project] > -1} {lappend projectFiles $ver}
                }
        }
        unset versions

        if ![catch {if {[llength $projectFiles] <= $::vfs::template::version::keep($root)} {error}}] {
                set keep $::vfs::template::version::keep($root)
                set projectFiles [lsort -decreasing -dictionary $projectFiles]
                set fileNumber [llength $projectFiles]
                for {set i [incr fileNumber -1]} {$i >= 0} {incr i -1} {
                        if {[llength $projectFiles] <= $keep} {break}
                        if ![catch {file delete $path/[lindex $projectFiles $i]}] {set projectFiles [lreplace $projectFiles $i $i]}
                }
        }

        set fileName [file tail [lindex [lsort -command VersionSort $versionFiles] 0]]

        if !$checkProject {
                if {[lindex [split $fileName \;] 2] == "deleted"} {return [file join $path $relative]}
        }

        if $checkProject {
                if {([lindex [split $fileName \;] 2] == "deleted") && ([lsearch $projects "deleted"] == -1)} {return [file join $path $relative]}
                set projectMember 0
                set tags [lrange [split $fileName \;] 1 end]
                if {[lindex $tags 1] == "deleted"} {set tags [lreplace $tags 1 1]}
                foreach project $projects {
                                if {[lsearch $tags $project] > -1} {set projectMember 1}
                }
                set projectLength [llength $tags]
                if {($projectLength > 1) && !$projectMember} {return [file join $path $relative]}
        }
        if ![string first .&dir [file tail $relative]] {
                set fileName [file join $path [file dirname $relative]]
                return [file normalize $fileName]
        }
        return [file join [file dirname [file join $path $relative]] $fileName]
 }

 proc VCreateTag {root} {
        set tag [clock scan [clock format [clock seconds] -format %Y%m%dT%H%M%S -gmt 1]][string range [clock clicks -milliseconds] end-2 end]
        if [info exists ::vfs::template::version::project($root)] {
                set projects [string map {; &s} [string map {& &a} $::vfs::template::version::project($root)]]
                set projectTag [join $projects \;]
                set tag [join "$tag $projectTag" \;]
        }
        return $tag
 }

 proc VersionSort {element1 element2} {
        set root [file dirname $element1]
        set element1 [file tail $element1]
        set element2 [file tail $element2]
        if [string equal $element1 $element2] {return 0}
        set sorted [lsort -dictionary -decreasing "$element1 $element2"]
        if {[lindex $sorted 0] == $element1} {set returnValue -1}
        if {[lindex $sorted 0] == $element2} {set returnValue 1}

        set time1 [lindex [split $element1 \;] 1]
        set time2 [lindex [split $element2 \;] 1]
        set time $time1
        if {$time2 > $time1} {set time $time2}
        if [info exists ::vfs::template::version::time($root)] {
                set returnValue -1
                set time $::vfs::template::version::time($root)
                if {!([string is digit -strict $time] && ([string length $time] == [string length $time1]))} {
                        set time [SetTime $::vfs::template::version::time($root)]
                }

                if {$time1 > $time} {set time1 [expr $time2 - 1]}
                if {($time2 <= $time) && ($time2 > $time1)} {set returnValue 1}
        }

        if [info exists ::vfs::template::version::project($root)] {
                set projects [string map {; &s} [string map {& &a} $::vfs::template::version::project($root)]]
                set project1 0
                set project2 0
                foreach project [lsort -unique $projects] {
                        set sumproject1 [lsearch [lrange [split $element1 \;] 1 end] $project]
                        set sumproject2 [lsearch [lrange [split $element2 \;] 1 end] $project]
                        incr sumproject1 ; incr sumproject2
                        if $sumproject1 {incr project1}
                        if $sumproject2 {incr project2}
                }
                if {$project1 > $project2} {set project1 1 ; set project2 0}
                if {$project1 < $project2} {set project1 0 ; set project2 1}

                set tagEnd1 [lindex [split $element1 \;] 2]
                if {$tagEnd1 == "deleted"} {set tagEnd1 [lindex [split $element1 \;] 3]}
                set tagEnd2 [lindex [split $element2 \;] 2]
                if {$tagEnd2 == "deleted"} {set tagEnd2 [lindex [split $element2 \;] 3]}
                if {($tagEnd1 == {}) && !($tagEnd2 == {})} {set returnValue -1}
                if {!($tagEnd1 == {}) && ($tagEnd2 == {})} {set returnValue 1}

                if {$project2 && !$project1 && ($time2 <= $time)} {set returnValue 1}
                if {$project1 && !$project2 && ($time1 <= $time)} {set returnValue -1}
        }
        return $returnValue
 }

 proc VFileNameEncode {filename} {
        set filename [file dirname $filename]/[string map {& &a} [file tail $filename]]
        set filename [file dirname $filename]/[string map {; &s} [file tail $filename]]
 }

 proc VFileNameDecode {filename} {
        set filename [file dirname $filename]/[lindex [split [file tail $filename] \;] 0]
        set filename [file dirname $filename]/[string map {&s ;} [file tail $filename]]
        set filename [file dirname $filename]/[string map {&a &} [file tail $filename]]
 }

 }
 # end namespace eval vfs::template::version

Wow! -jcw

Word - CMcC

escargo 17 Jul 2004 - I remember using the Digital Equipment Corporation's VMS operating system, which had a feature for saving multiple versions of files. You could set a limit on how many past versions you wanted retained, so that there would be some bounds on how much disk space you consumed. It might be worthwhile digging into old VMS manuals to see if there are utility functions related to versioned files that are worth implementing. The first that comes to mind would be purge, where only current versions of files are retained.

GWL 18 Jul 2004 - Actually the purge command took an argument specifing how many versions to keep (/keep=N if my memory is working), the default was just one.

escargo 18 Jul 2004 - It's that kind of detail from 15 years ago that made me suggest going to the manuals. Being able to set the limit on how many versions to keep would also be a good thing.


wcf3 18 Jul 2004 - I found a problem using glob while running on linux that I fixed by removing the file tail from the call to VFileNameDecode in the procedure MatchInDirectory. Everything worked great after that! I didn't want to alter the source until after I've had a chance to test it a bit more.

SEH I fixed the problem in the call to VFileNameDecode in MatchInDirectory. Thanks for the catch.

wcf3 20 Jul 2004 - It just keeps getting better! I would like to see another option that allows you to specify the age depth of the history files. So you could mount a versioning file system and lose history after some length of time, say 3 months old. This way files that are changed frequently over a short time will still have all versions for that time period. It would also be nice to be able to change the time option on the fly by a call to the mount procedures as well as query the versions for a file. I should probably let you know that I invented a versioning file system years back that is being sold as GoBack; I think Norton owns it now...all I can say is that my head is full of ideas for this! (whack me upside the head if I get out of control :-)

SEH GAAAHHH!!! Actually I'd like to limit the number and complexity of pruning options. It slows things down and disrupts the simplicity of the code. The nice thing about a scripted vfs is that the real location is perfectly accessible by standard means, so you could accomplish your age cleanup via a cron script or some such. I anticipate that a lot of administrative tasks are best done by direct operations on the existing file location. That's why I formatted the time tags as simple counts of milliseconds similar to [clock seconds] output.

Also, all configuration settings are accessible via namespace arrays, like ::vfs::template::version::time($root), and they can indeed be hot-swapped on the fly. So you could for example work on three separate projects branched from the main file base, then package each in separate output directories by changing the array value and copying the same directory three successive times.

I'm wondering now if it's worthwhile to make directories version and project aware like files.

AK: Wayback: A User-level Versioning File System for Linux @ http://www.usenix.org/events/usenix04/tech/freenix/cornell.html

wcf3 20 Jul 2004 - I've been playing with Wayback too...very nice indeed! I hope to have some time to build a nice GUI frontend for it and I've talked with the author about some other ideas for it (which he was already working on). As for accessing the internal variables to control the rollback time, I prefer to provide an API for stuff like that. It reminds me of the days when people would access a certain memory location for some value (remember getting the COM I/O port from the PC BIOS?)...easy, but a support/migration nightmare... Using a cron script to clean out old history is a much better idea; I wasn't thinking :-)

CMcC: I think the -project stuff overburdens this file system's semantics. I think a vfs which transforms a path, say by inserting global $project somewhere at the top of the hierarchy, could be used with collate/broadcast and a pure-versioning fs to achieve the same goal with less code complexity.


Category VFS Category Package