Version 17 of fileutil

Updated 2006-07-24 04:38:20

Documentation can be found at http://tcllib.sourceforge.net/doc/fileutil.html


Currently the fileutil package contains a find, grep, and cat. Other procs that would be useful to add would include wc, tee, head, tail, and perhaps some awk'ish type functions ala Tclx.

For more on grep, see "NR-grep: A Fast and Flexible Pattern Matching Tool" [L1 ].


Perhaps even some code like Glenn Jackman's:

    proc touch {filename {time ""}} {
        if {[string length $time] == 0} {set time [clock seconds]}
        file mtime $filename $time
        file atime $filename $time
    }

glennj: This proc has been accepted into tcllib 1.2: http://tcllib.sourceforge.net/doc/fileutil.html

US Unix-like touch:

    proc touch {filename {time ""}} {
        if {![file exists $filename]} {
           close [open $filename a]
           }
        if {[string length $time] == 0} {set time [clock seconds]}
        file mtime $filename $time
        file atime $filename $time
    }

What other file-related procs would be useful?


2003-11-28 VI Nice of you to ask. There's a list above, other than that: tail -f, split, join. I use tkcon as my main shell on a wimpy laptop. Fewer dlls loaded is good..


2003-12-16 SS Trying to improve over the Tcl implementation of wc in the Great Language Shootout I wrote this, that seems half in execution time against big files:

 set text [read stdin]
 set c [string length $text]
 set l [expr {[llength [split $text "\n\r"]]-1}]
 set T [split $text "\n\r\t "]
 set w [expr {[llength $T]-[llength [lsearch -all -exact $T {}]]-1}]
 puts "\t$l\t$w\t$c"

Output seems to be identical to GNU's wc command.


SEH 20060723 -- The proc fileutil::find is useful, but it has several deficiencies:

  • On Windows, hidden files are mishandled.
  • On Windows, checks to avoid infinite loops due to nested symbolic links are not done.
  • On Unix, nested loop checking requires a "file stat" of each file/dir encountered, a significant performance hit.
  • The proc calls itself recursively, and thus risks running into interp recursion limits for very large systems.
  • fileutil.tcl contains three separate instantiations of proc find for varying os's/versions. Maintenance nightmare.

The following code eliminates all the above deficiencies. It checks for nested symbolic links in a platform-independent way, and scans directory hierarchies without recursion.

For speed and simplicity, it takes advantage of glob's ability to use multiple patterns to scan deeply into a directory structure in a single command, hence the name. Its calling syntax is the same as fileutil::find, so with a name change it could be used as a drop-in replacement:

 proc globfind {{basedir .} {filtercmd {}}} {
        set depth 16
        set filt [string length $filtercmd]
        set basedir [file normalize $basedir]
        file stat $basedir fs
        set linkName $basedir
        while {$fs(type) == "link"} {
                if [catch {file stat [set linkName [file normalize [file link $linkName]]] fs}] {break}
        }
        if {$fs(type) == "file"} {
                set filename $basedir
                if {!$filt || [uplevel $filtercmd [list $filename]]} {
                            return $filename
                }
        }
        set globPatternTotal {}
        set globPattern *
        set incrPattern /*
        for {set i 0} {$i < $depth} {incr i} {
                lappend globPatternTotal $globPattern
                append globPattern $incrPattern
        }

        lappend checkDirs $basedir
        set returnFiles {}
        set redo 0
        set terminate 0
        set hidden {}
        while {!$terminate} {
                set currentDir [lindex $checkDirs 0]
                if !$redo {set allFiles [eval glob -directory [list $currentDir] -nocomplain $hidden $globPatternTotal]}
                set redo 0
                set termFile [lindex $allFiles end]
                set termFile [lrange [file split $termFile] [llength [file split $currentDir]] end]
                if {$hidden != {}} {
                        set checkDirs [lrange $checkDirs 1 end]
                }
                foreach test {checkdirs length duplicate recursion prune} {
                        switch $test {
                                checkdirs {
                                        set afIndex [llength $allFiles]
                                        incr afIndex -1
                                        for {set i $afIndex} {$i >= 0} {incr i -1} {
                                                set cdir [lindex $allFiles $i]
                                                if {[llength [lrange [file split $cdir] [llength [file split $currentDir]] end]] < $depth} {break}
                                                file stat $cdir fs
                                                set linkName $cdir
                                                while {$fs(type) == "link"} {
                                                        if [catch {file stat [set linkName [file normalize [file link $linkName]]] fs}] {break}
                                                }
                                                if {$fs(type) == "directory"} {lappend checkDirs $cdir}
                                        }
                                }                                        
                            length {
                                        if {[llength $termFile] < $depth} {break}
                                }
                            duplicate {
                                        set recurseTest 0
                                        set dupFile [lindex $allFiles end]
                                        set dupFile [lrange [file split $dupFile] [llength [file split $basedir]] end]
                                        set dupFileEndDir [expr [llength $dupFile] - 2]
                                        if {[lsearch $dupFile [lindex $dupFile end-1]] < $dupFileEndDir} {
                                          set recurseTest 1
                                        }
                                }
                            recursion {
                                        if !$recurseTest {continue}
                                        if {($hidden == {})} {set type "-types l"} else {set type "-types [list "hidden l"]"}

                                        set linkFiles {}
                                        set linkDir $currentDir
                                        while 1 {
                                                set linkFiles [concat $linkFiles [eval glob -directory [list $linkDir] -nocomplain $type $globPatternTotal]]
                                                if {$linkDir == $basedir} {break}
                                                set linkDir [file dirname $linkDir]
                                        }
                                        array unset links
                                        set linkFiles [lsort -unique $linkFiles]
                                        foreach lf $linkFiles {
                                                set ltarget [file normalize [file readlink $lf]]
                                                if {[array names links -exact $ltarget] != {}} {
                                                        lappend pruneLinks $lf
                                                        set redo 1
                                                }
                                                array set links "$ltarget $lf"
                                        }
                               }
                                prune {
                                        if ![info exists pruneLinks] {continue}
                                        set afIndex [llength $allFiles]
                                        incr afIndex -1
                                        set cdIndex [llength $checkDirs]
                                        incr cdIndex -1
                                        set rfIndex [llength $returnFiles]
                                        incr rfIndex -1
                                        foreach pl $pruneLinks {
                                                for {set i $afIndex} {$i >= 0} {incr i -1} {
                                                        set af [lindex $allFiles $i]
                                                        if ![string first $pl/ $af] {set allFiles [lreplace $allFiles $i $i]}
                                                }
                                                for {set i $cdIndex} {$i >= 0} {incr i -1} {
                                                        set cd [lindex $checkDirs $i]
                                                        if ![string first $pl/ $cd] {set checkDirs [lreplace $checkDirs $i $i]}
                                                }
                                                for {set i $rfIndex} {$i >= 0} {incr i -1} {
                                                        set rf [lindex $returnFiles $i]
                                                        if ![string first $pl/ $rf] {set returnFiles [lreplace $returnFiles $i $i]}
                                                }
                                        }
                                        unset pruneLinks
                                }
                            default {}
                        }
                }
                if $redo continue
                if {$hidden == {}} {
                        set hidden "-types hidden"
                } else {
                        set hidden {}
                        if {[llength $checkDirs] == 0} {set terminate 1}
                }
                set returnFiles [concat $returnFiles $allFiles]
        }
        set filterFiles {}
        foreach filename [lsort -unique $returnFiles] {
                if {!$filt || [uplevel $filtercmd [list $filename]]} {
                        lappend filterFiles $filename
                }
        }
        return $filterFiles
 }

Category Package, subset Tcllib, Category File