Version 27 of A versioning virtual filesystem

Updated 2004-11-05 00:13:39 by SEH

SEH A versioning virtual filesystem.

Similar to historical versioning filesystems, each edited version of a file is saved separately; each version file is tagged with a timestamp. A tag is a semi-colon and tag info appended to the file name. A deleted file is represented by a new zero-length file with timestamp and a tag reading "deleted". By default only the latest versions of files are visible. If the latest version is marked deleted, it is invisible.

Based on A template virtual filesystem


SEH 19 Jul 2004 - I added three optional mount flags: -keep , -project , and -time

Usage: Mount <existing directory> -keep <integer> -project <string> -time <seconds or string acceptable to clock scan> <virtual versioning directory>

-keep: sets the maximum number of versions per file to retain. If the -project tag is used, other projects' versions are ignored.

-project: specify a named project and all new edits get tagged with this name string. As long as you're mounted with this option, versions with the corresponding tag will be preferred, and edits tagged by other projects will be invisible. Multiple -project tags can be used; new versions will tagged with all specified project tags - thus you can share edits among several projects.

-time: specify a time in the past and the vfs will appear just as it did at that time. You can make new edits, but they will be invisible until you remount without a time specified. Can be used in conjunction with -project.


SEH 4 Nov 2004 - Cleaned up and tested.


 # A versioning virtual filesystem.

 # Usage: Mount <existing directory> ?-keep <number>? ?-project <project name list>? ?-time <time>? <virtual versioning directory>

 # Options:
 # -keep
 #        maximum number of previous file versions to keep.
 # -project
 #        a list of one or more identifying tags to associate with all files created or edited.
 #        If a file has project tags but none of them is included in the project
 #        name list, it will be invisible.
 #
 #        Deleted files are marked with the tag "deleted".  If this value is in the project
 #        name list, then deleted files will become visible again.
 #
 #        The project name list is stored in namespace variable $project($root).
 #        This variable can be edited while volume is still mounted, changes will
 #        will be taken into account dynamically.
 # -time
 #        A timestamp in the form of [clock seconds], or a string understandable by [clock scan].
 #        Versions of files as they existed at the given time will be visible, rather than the
 #        default latest version.  Value is stored in the namespace value $time($root) and can be edited
 #        with volume still mounted.  Visible file versions will change to reflect edited value.

 # Similar to historical versioning filesystems, each edited version of a file is saved separately;
 # each version file is tagged with a timestamp.
 # A deleted file is represented by a new zero-length file with timestamp and a tag reading "deleted".
 # By default only the latest version is visible.  If the latest version is marked deleted, it is invisible.

 package require vfs 1

 namespace eval ::vfs::template::version {}

 proc ::vfs::template::version::Mount {args} {
        set volume {}
        if {[lindex $args 0] == "-volume"} {set volume "-volume"}
        set pathto [eval MountProcedure $args]
        set path [lindex $pathto 0]
        set to [lindex $pathto 1]
        eval ::vfs::filesystem mount $volume \$to \[list ::vfs::template::version::handler \$path\]
        ::vfs::RegisterMount $to [list ::vfs::template::version::Unmount]
        if {[trace info execution close] == {}} {
                trace add execution close leave ::vfs::template::version::CloseTrace
        }
        return $to
 }

 proc ::vfs::template::version::Unmount {to} {
        set to [file normalize $to]
        set path [lindex [::vfs::filesystem info $to] end]
        UnmountProcedure $path $to
        ::vfs::filesystem unmount $to
 }

 proc ::vfs::template::version::CloseTrace {commandString code result op} {
        set errorCode $::errorCode
        set errorInfo $::errorInfo
        set channelID [lindex $commandString 1]
        if [regexp {::Close ([^ ]+?) } $errorInfo trash errorChannelID] {
                if [string equal $channelID $errorChannelID] {
                        if {[lindex $errorCode 0] == "POSIX"} {
                                set pError [lindex $errorCode 1]
                                ::vfs::filesystem posixerror $::vfs::posix($pError) ; return -code error $::vfs::posix($pError)
                        }
                        set message [lindex [split $errorInfo \n] 0]
                        error $message $errorInfo $errorCode
                }
        }
 }

 proc ::vfs::template::version::handler {path cmd root relative actualpath args} {
        switch -- $cmd {
                access {
                        set mode [lindex $args 0]
                        set error [catch {Access $path $root $relative $actualpath $mode}]
                        if $error {::vfs::filesystem posixerror $::vfs::posix(EACCES) ; return -code error $::vfs::posix(EACCES)}
                }
                createdirectory {
                        CreateDirectory $path $root $relative $actualpath
                }
                deletefile {
                        DeleteFile $path $root $relative $actualpath
                }
                fileattributes {
                        set index [lindex $args 0]
                        set value [lindex $args 1]
                        array set attributes [FileAttributes $path $root $relative $actualpath]
                        if {$index == {}} {
                                return [lsort [array names attributes]]
                        }
                        set attribute [lindex [lsort [array names attributes]] $index]
                        if {$value == {}} {
                                return $attributes($attribute)
                        }
                        FileAttributesSet $path $root $relative $actualpath $attribute $value
                }
                matchindirectory {
                        set pattern [lindex $args 0]
                        set types [lindex $args 1]
                        return [MatchInDirectory $path $root $relative $actualpath $pattern $types]
                } open {
                        set mode [lindex $args 0]
                        if {$mode == {}} {set mode r}
                        set permissions [lindex $args 1]
                        set channelID [Open $path $root $relative $actualpath $mode $permissions]
                        switch -glob -- $mode {
                                "" -
                                "r*" -
                                "w*" {
                                        seek $channelID 0
                                }
                                "a*" {
                                            seek $channelID 0 end
                                }
                                default {
                                        ::vfs::filesystem posixerror $::vfs::posix(EINVAL)
                                        return -code error $::vfs::posix(EINVAL)
                                }
                        }
                        set result $channelID
                        lappend result [list ::vfs::template::version::Close $channelID $path $root $relative $actualpath $mode]
                        return $result
                }
                removedirectory {
                        set recursive [lindex $args 0]
                        if !$recursive {
                                if {[MatchInDirectory $path $root $relative $actualpath * 0] != {}} {
                                        ::vfs::filesystem posixerror $::vfs::posix(EEXIST)
                                        return -code error $::vfs::posix(EEXIST)
                                }
                        }
                        RemoveDirectory $path $root $relative $actualpath
                }
                stat {
                        return [Stat $path $root $relative $actualpath]
                }
                utime {
                        set atime [lindex $args 0]
                        set mtime [lindex $args 1]
                        Utime $path $root $relative $actualpath $atime $mtime
                }
        }
 }

 namespace eval ::vfs::template::version {

 proc MountProcedure {args} {
        set to [file normalize [lindex $args end]]
        set args [lrange $args 0 end-1]
        set path [lindex $args 0]
        set args [lrange $args 1 end]

        if [info exists ::vfs::_unmountCmd($to)] {::vfs::unmount $to}

        set argsLength [llength $args]
        for {set i 0} {$i < $argsLength} {incr i} {
                switch -- [lindex $args $i] {
                        -keep {
                                set keep [lindex $args [incr i]]
                                if ![string is digit -strict $keep] {continue}
                                set ::vfs::template::version::keep($to) $keep
                        }
                        -project {
                                set project [lindex $args [incr i]]
                                lappend ::vfs::template::version::project($to) $project
                        }
                        -time {
                                set time [lindex $args [incr i]]
                                SetTime $time
                                set ::vfs::template::version::time($to) $time
                        }
                }
        }

        file mkdir $path

        lappend pathto $path
        lappend pathto $to
        return $pathto
 }

 proc UnmountProcedure {path to} {
        array unset ::vfs::template::version::keep $to
        array unset ::vfs::template::version::project $to
        array unset ::vfs::template::version::time $to
        array unset ::vfs::_unmountCmd $to
        return
 }

 proc Access {path root relative actualpath mode} {
        set modeString [::vfs::accessMode $mode]
        if {$modeString == "F"} {set modeString RWX}
        set modeString [split $modeString {}]
        set fileName [VAcquireFile $path $root $relative $actualpath]
        if [file readable $fileName] {lappend fileString R}
        if [file writable $fileName] {lappend fileString W}
        if [file executable $fileName] {lappend fileString X}
        foreach mS $modeString {
                set errorMessage "not [string map {R readable W writable X executable} $mS]"
                if {[lsearch $fileString $mS] == -1} {error $errorMessage}
        }
 }

 proc CreateDirectory {path root relative actualpath} {
        file mkdir [file join $path $relative]
 }

 proc DeleteFile {path root relative actualpath} {
        set fileName [VFileNameEncode [file join $path $relative]]\;[VCreateTag $root]
        set fileName [split $fileName \;]
        set fileName [linsert $fileName 2 "deleted"]
        set fileName [join $fileName \;]
        close [open $fileName w]
 }

 proc FileAttributes {path root relative actualpath} {
        set fileName [VAcquireFile $path $root $relative $actualpath]
        array set attributes [file attributes $fileName]
        array unset attributes -longname
        array unset attributes -shortname
        return [array get attributes]
 }

 proc FileAttributesSet {path root relative actualpath attribute value} {
        if [info exists ::vfs::template::version::time($root)] {
                set existingTime $::vfs::template::version::time($root)
                set ::vfs::template::version::time($root) [clock seconds]
                set fileName [VAcquireFile $path $root $relative $actualpath]
                set ::vfs::template::version::time($root) $existingTime
        } else {
                set fileName [VAcquireFile $path $root $relative $actualpath]
        }
        file attributes $fileName $attribute $value
 }

 proc MatchInDirectory {path root relative actualpath pattern types} {
        if [::vfs::matchDirectories $types] {lappend typeString d}
        if [::vfs::matchFiles $types] {lappend typeString f}

        set globList "[glob -directory [file join $path $relative] -nocomplain -types $typeString *] [glob -directory [file join $path $relative] -nocomplain -types "$typeString hidden" *]"
        set pathLength [expr [string length $path] - 1]
        set newGlobList {}
        set acquireAttempts {}
        foreach gL $globList {
                if [file isfile $gL] {
                        set gL [VFileNameDecode $gL]
                        if {[lsearch $acquireAttempts $gL] > -1} {continue}
                        lappend acquireAttempts $gL
                        set acquiredFile [VAcquireFile $path $root [list [file join $relative [file tail $gL]]] $actualpath]
                        if [string equal $acquiredFile $gL] {continue}
                }
                set gL [string replace $gL 0 $pathLength $root]
                if [string match $pattern [file tail $gL]] {lappend newGlobList $gL}
        }
        return [lsort -unique $newGlobList]
 }

 proc Open {path root relative actualpath mode permissions} {
        if [catch {package present md5 2}] {package forget md5 ; package require md5 2}
        set fileName [VAcquireFile $path $root $relative $actualpath]
        if {$mode == "r"} {return [open $fileName]}
        if {$fileName == [file join $path $relative]} {
                set channelID [vfs::memchan]
                set ::vfs::template::version::filestats($channelID) "-permissions $permissions"
                return $channelID
        }
        set f [open $fileName r]
        fconfigure $f -translation binary
        set filed [vfs::memchan]
        fconfigure $filed -translation binary
        fcopy $f $filed
        close $f
        seek $filed 0
        set md5 [::md5::md5 -hex -- [read $filed]]
        fconfigure $filed -translation auto
        seek $filed 0
        file stat $fileName fileStats
        array set fileStats [file attributes $fileName]
        set fileStats(md5) $md5
        set ::vfs::template::version::filestats($filed) [array get fileStats]
        return $filed
 }

 proc Close {channelID path root relative actualpath mode} {
 # Do not close the channel in the close callback!
 # It will crash Tcl!  Honest!
 # The core will close the channel after you've taken what info you need from it.

 #        close $channelID
        if {$mode == "r"} {return}
        if [catch {package present md5 2}] {package forget md5 ; package require md5 2}
        array set fileStats $::vfs::template::version::filestats($channelID)
        unset ::vfs::template::version::filestats($channelID)
        fconfigure $channelID -translation binary
        seek $channelID 0
        set md5 [::md5::md5 -hex -- [read $channelID]]
        if [string equal -nocase $md5 $fileStats(md5)] {return}

        set fileName [VFileNameEncode [file join $path $relative]]\;[VCreateTag $root]
        set f [open $fileName w]
        fconfigure $f -translation binary
        seek $channelID 0
        fcopy $channelID $f
        close $f
        catch {file attributes $fileName -readonly 0}
        catch {file attributes $fileName -permissions rw-rw-rw-}
        catch {file attributes $fileName -owner $fileStats(uid)}
        catch {file attributes $fileName -group $fileStats(gid)}
        catch {file atime $fileName $fileStats(atime)}

        foreach attr [array names fileStats] {
                if [string first "-" $attr] {continue}
                if ![string first "-permissions" $attr] {continue}
                if ![string first "-readonly" $attr] {continue}
                catch {file attributes $fileName $attr $fileStats($attr)}
        }
        catch {file attributes $fileName -permissions $fileStats(mode)}
        catch {file attributes $fileName -readonly $fileStats(-readonly)}

        return
 }

 proc RemoveDirectory {path root relative actualpath} {
        catch {file delete [file join $path $relative]}
 }

 proc Stat {path root relative actualpath} {
        set fileName [VAcquireFile $path $root $relative $actualpath]
        file stat $fileName fs
        return [array get fs]
 }

 proc Utime {path root relative actualpath atime mtime} {
        if [info exists ::vfs::template::version::time($root)] {
                set existingTime $::vfs::template::version::time($root)
                set ::vfs::template::version::time($root) [clock seconds]
                set fileName [VAcquireFile $path $root $relative $actualpath]
                set ::vfs::template::version::time($root) $existingTime
        } else {
                set fileName [VAcquireFile $path $root $relative $actualpath]
        }

        file atime $fileName $atime
        file mtime $fileName $mtime
 }

 proc SetTime {time} {
        if ![string is digit -strict $time] {catch {set time [clock scan $time]}}
        if ![string is digit -strict $time] {error "invalid time value."}
        set time "[clock scan [clock format $time -format %Y%m%dT%H%M%S -gmt 1]]000"
 }

 proc VAcquireFile {path root relative actualpath} {
        if [file isdirectory [file join $path $relative]] {return [file join $path $relative]}
        set fileName [VFileNameEncode [file join $path $relative]]
        set versions [glob -path $fileName -nocomplain -types f "\;*"]
        if {$versions == {}} {return [file join $path $relative]}

        set checkProject 0
        if [info exists ::vfs::template::version::project($root)] {
                set projects [string map {; &s} [string map {& &a} $::vfs::template::version::project($root)]]
                set checkProject 1
        }
        foreach ver $versions {
                set ver $root/[file tail $ver]
                lappend versionFiles $ver
                if $checkProject {
                        foreach project $projects {
                                if {[lsearch [lrange [split $ver \;] 1 end] $project] > -1} {lappend projectFiles $ver}
                        }
                }
        }
        unset versions

        if ![catch {if {[llength $projectFiles] <= $::vfs::template::version::keep($root)} {error}}] {
                set keep $::vfs::template::version::keep($root)
                set projectFiles [lsort -decreasing -dictionary $projectFiles]
                set fileNumber [llength $projectFiles]
                for {set i [incr fileNumber -1]} {$i >= 0} {incr i -1} {
                        if {[llength $projectFiles] <= $keep} {break}
                        if ![catch {file delete $path/[lindex $projectFiles $i]}] {set projectFiles [lreplace $projectFiles $i $i]}
                }
        }

        set fileName [file tail [lindex [lsort -command VersionSort $versionFiles] 0]]
        if {([lindex [split $fileName \;] 2] == "deleted") && ([lsearch $::vfs::template::version::project($root) "deleted"] == -1)} {
                return [file join $path $relative]
        }
        if {$checkProject} {
                set projectMember 0
                set tags [lrange [split $fileName \;] 1 end]
                if {[lindex $tags 1] == "deleted"} {set tags [lreplace $tags 1 1]}
                foreach project $projects {
                                if {[lsearch $tags $project] > -1} {set projectMember 1}
                }
                set projectLength [llength $tags]
                if {($projectLength > 1) && !$projectMember} {return [file join $path $relative]}
        }
        return [file join [file dirname [file join $path $relative]] $fileName]
 }

 proc VCreateTag {root} {
        set tag [clock scan [clock format [clock seconds] -format %Y%m%dT%H%M%S -gmt 1]][string range [clock clicks -milliseconds] end-2 end]
        if [info exists ::vfs::template::version::project($root)] {
                set projects [string map {; &s} [string map {& &a} $::vfs::template::version::project($root)]]
                set projectTag [join $projects \;]
                set tag [join "$tag $projectTag" \;]
        }
        return $tag
 }

 proc VersionSort {element1 element2} {
        set root [file dirname $element1]
        set element1 [file tail $element1]
        set element2 [file tail $element2]
        if [string equal $element1 $element2] {return 0}
        set sorted [lsort -dictionary -decreasing "$element1 $element2"]
        if {[lindex $sorted 0] == $element1} {set returnValue -1}
        if {[lindex $sorted 0] == $element2} {set returnValue 1}

        set time1 [lindex [split $element1 \;] 1]
        set time2 [lindex [split $element2 \;] 1]
        set time $time1
        if {$time2 > $time1} {set time $time2}
        if [info exists ::vfs::template::version::time($root)] {
                set returnValue -1
                set time [SetTime $::vfs::template::version::time($root)]
                if {$time1 > $time} {set time1 [expr $time2 - 1]}
                if {($time2 <= $time) && ($time2 > $time1)} {set returnValue 1}
        }

        if [info exists ::vfs::template::version::project($root)] {
                set projects [string map {; &s} [string map {& &a} $::vfs::template::version::project($root)]]
                foreach project $projects {
                        set project1 [lsearch [lrange [split $element1 \;] 1 end] $project]
                        set project2 [lsearch [lrange [split $element2 \;] 1 end] $project]
                        incr project1 ; incr project2
                        if {$project2 && !$project1 && ($time2 <= $time)} {set returnValue 1 ; break}
                        if {$project1 && !$project2 && ($time1 <= $time)} {set returnValue -1 ; break}
                        set tagEnd1 [lindex [split $element1 \;] 2]
                        if {$tagEnd1 == "deleted"} {set tagEnd1 [lindex [split $element1 \;] 3]}
                        set tagEnd2 [lindex [split $element2 \;] 2]
                        if {$tagEnd2 == "deleted"} {set tagEnd2 [lindex [split $element2 \;] 3]}
                        if {($tagEnd1 == {}) && !($tagEnd2 == {})} {set returnValue -1}
                        if {!($tagEnd1 == {}) && ($tagEnd2 == {})} {set returnValue 1}
                }
        }
        return $returnValue
 }

 proc VFileNameEncode {filename} {
        set filename [file dirname $filename]/[string map {& &a} [file tail $filename]]
        set filename [file dirname $filename]/[string map {; &s} [file tail $filename]]
 }

 proc VFileNameDecode {filename} {
        set filename [file dirname $filename]/[lindex [split [file tail $filename] \;] 0]
        set filename [file dirname $filename]/[string map {&s ;} [file tail $filename]]
        set filename [file dirname $filename]/[string map {&a &} [file tail $filename]]
 }

 }
 # end namespace eval vfs::template::version

Wow! -jcw

Word - CMcC

escargo 17 Jul 2004 - I remember using the Digital Equipment Corporation's VMS operating system, which had a feature for saving multiple versions of files. You could set a limit on how many past versions you wanted retained, so that there would be some bounds on how much disk space you consumed. It might be worthwhile digging into old VMS manuals to see if there are utility functions related to versioned files that are worth implementing. The first that comes to mind would be purge, where only current versions of files are retained.

GWL 18 Jul 2004 - Actually the purge command took an argument specifing how many versions to keep (/keep=N if my memory is working), the default was just one.

escargo 18 Jul 2004 - It's that kind of detail from 15 years ago that made me suggest going to the manuals. Being able to set the limit on how many versions to keep would also be a good thing.


wcf3 18 Jul 2004 - I found a problem using glob while running on linux that I fixed by removing the file tail from the call to VFileNameDecode in the procedure MatchInDirectory. Everything worked great after that! I didn't want to alter the source until after I've had a chance to test it a bit more.

SEH I fixed the problem in the call to VFileNameDecode in MatchInDirectory. Thanks for the catch.

wcf3 20 Jul 2004 - It just keeps getting better! I would like to see another option that allows you to specify the age depth of the history files. So you could mount a versioning file system and lose history after some length of time, say 3 months old. This way files that are changed frequently over a short time will still have all versions for that time period. It would also be nice to be able to change the time option on the fly by a call to the mount procedures as well as query the versions for a file. I should probably let you know that I invented a versioning file system years back that is being sold as GoBack; I think Norton owns it now...all I can say is that my head is full of ideas for this! (whack me upside the head if I get out of control :-)

SEH GAAAHHH!!! Actually I'd like to limit the number and complexity of pruning options. It slows things down and disrupts the simplicity of the code. The nice thing about a scripted vfs is that the real location is perfectly accessible by standard means, so you could accomplish your age cleanup via a cron script or some such. I anticipate that a lot of administrative tasks are best done by direct operations on the existing file location. That's why I formatted the time tags as simple counts of milliseconds similar to [clock seconds] output.

Also, all configuration settings are accessible via namespace arrays, like ::vfs::template::version::time($root), and they can indeed be hot-swapped on the fly. So you could for example work on three separate projects branched from the main file base, then package each in separate output directories by changing the array value and copying the same directory three successive times.

I'm wondering now if it's worthwhile to make directories version and project aware like files.

AK: Wayback: A User-level Versioning File System for Linux @ http://www.usenix.org/events/usenix04/tech/freenix/cornell.html

wcf3 20 Jul 2004 - I've been playing with Wayback too...very nice indeed! I hope to have some time to build a nice GUI frontend for it and I've talked with the author about some other ideas for it (which he was already working on). As for accessing the internal variables to control the rollback time, I prefer to provide an API for stuff like that. It reminds me of the days when people would access a certain memory location for some value (remember getting the COM I/O port from the PC BIOS?)...easy, but a support/migration nightmare... Using a cron script to clean out old history is a much better idea; I wasn't thinking :-)

CMcC: I think the -project stuff overburdens this file system's semantics. I think a vfs which transforms a path, say by inserting global $project somewhere at the top of the hierarchy, could be used with collate/broadcast and a pure-versioning fs to achieve the same goal with less code complexity.


Category VFS