Version 61 of fileutil

Updated 2013-12-11 05:28:08 by pooryorick

fileutil, a Tcllib module, provides utilities for working with files and directories.

See Also

NR-grep: A Fast and Flexible Pattern Matching Tool
Unixy minitools

Documentation

fileutil official reference
fileutil::magic::cfront official reference
fileutil::magic::cgen official reference
fileutil::magic::filetype official reference
fileutil::magic::mimetype official reference
fileutil::magic::rt official reference
fileutil::multi official reference
fileutil::multi::op official reference

Modules

fileutil::traverse

See also the modules in the Documentation section, which don't yet have a wiki page.

Commands

  • ::fileutil::fullnormalize path
  • ::fileutil::test path codes ? msgvar ? ? label ?
  • ::fileutil::cat ( ? options ? file)...
  • ::fileutil::writeFile ? options ? file data
  • ::fileutil::appendToFile ? options ? file data
  • ::fileutil::insertIntoFile ? options ? file at data
  • ::fileutil::removeFromFile ? options ? file at n
  • ::fileutil::replaceInFile ? options ? file at n data
  • ::fileutil::updateInPlace ? options ? file cmd
  • ::fileutil::fileType filename
  • ::fileutil::find ? basedir ? filtercmd ? ?
  • ::fileutil::findByPattern basedir ? -regexp|-glob ? ? -- ? patterns
  • ::fileutil::foreachLine var filename cmd
  • ::fileutil::grep pattern ? files ?
  • ::fileutil::install ? -m mode ? source destination
  • ::fileutil::stripN path n
  • ::fileutil::stripPwd path
  • ::fileutil::stripPath prefix path
  • ::fileutil::jail jail path
  • ::fileutil::touch ? -a ? ? -c ? ? -m ? ? -r ref_file ? ? -t time ? filename ? ... ?
  • ::fileutil::tempdir
  • ::fileutil::tempdir path
  • ::fileutil::tempdirReset
  • ::fileutil::tempfile ? prefix ?
  • ::fileutil::relative base dst
  • ::fileutil::relativeUrl base dst
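
A minimal sketch of how a few of the commands listed above fit together, assuming Tcllib is installed and on the package path. The prefix demo_ and the file contents are made up for illustration.

```tcl
package require fileutil

# tempfile returns the name of a fresh file in the temporary directory.
set path [fileutil::tempfile demo_]

# writeFile replaces the file's contents; appendToFile adds to the end.
fileutil::writeFile $path "alpha\nbeta\n"
fileutil::appendToFile $path "gamma\n"

# cat returns the whole file as one string.
set contents [fileutil::cat $path]

# foreachLine runs the script once per line, with the line in the named variable.
set lines {}
fileutil::foreachLine line $path {
    lappend lines $line
}

file delete $path
puts $lines
```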

What other file-related procs would be useful?

Other procs that would be useful to add include wc, tee, head, tail, and perhaps some awk-like functions à la TclX.
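
As a rough idea of what head and tail could look like in plain Tcl, here is a sketch. The proc names and the default of 10 lines are assumptions borrowed from the Unix tools; none of this is part of fileutil.

```tcl
# head: return the first n lines of a file as a list,
# reading no further than necessary.
proc head {filename {n 10}} {
    set fh [open $filename r]
    set lines {}
    while {[llength $lines] < $n && [gets $fh line] >= 0} {
        lappend lines $line
    }
    close $fh
    return $lines
}

# tail: return the last n lines of a file as a list.
# This simple version reads the whole file into memory.
proc tail {filename {n 10}} {
    set fh [open $filename r]
    set data [read -nonewline $fh]
    close $fh
    if {$n <= 0} {
        return {}
    }
    return [lrange [split $data \n] end-[expr {$n - 1}] end]
}
```

For very large files a real tail would seek backwards from the end instead of reading everything.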

LV: Does anyone have a Tcl version of the dircmp command? I don't see it in the Cygwin package list, and a casual search on Google didn't turn one up.


VI 2003-11-28: Nice of you to ask. There's a list above; other than that: tail -f, split, join. I use tkcon as my main shell on a wimpy laptop. Fewer DLLs loaded is good.


LV: I think some procs emulating the functionality (not necessarily the flags, etc.) of Unix commands such as:

  • cut - extract one or more columns of text from the input file
  • join - create the union of one or more files containing columns of data, using a common column as an index
  • sort - sort a file based on the contents of one or more columns
  • comm - extract rows of data common, or uncommon, between 2 or more files
  • uniq - extract unique rows (or count the occurrences of unique rows) in a file

would be useful. Several of these commands have, at their core, the idea of files being a series of columns, separated by some character or position, and allow a person to select one or more specific columns upon which to perform functions. They represent, in a sense, shortcuts for various awk scripts.
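
To make the column idea concrete, here is a sketch of cut and of uniq -c working on lists of lines. The proc names, the 0-based column numbering and the default tab separator are assumptions made for this example, not an existing fileutil API.

```tcl
# cut: for each line, keep only the given 0-based columns,
# splitting and re-joining on the separator character.
proc cut {lines columns {sep "\t"}} {
    set out {}
    foreach line $lines {
        set fields [split $line $sep]
        set picked {}
        foreach c $columns {
            lappend picked [lindex $fields $c]
        }
        lappend out [join $picked $sep]
    }
    return $out
}

# uniq: collapse runs of adjacent identical lines into {count line} pairs,
# like "uniq -c". Input is expected to be sorted if global counts are wanted.
proc uniq {lines} {
    set out {}
    set count 0
    foreach line $lines {
        if {$count > 0 && $line eq $prev} {
            incr count
        } else {
            if {$count > 0} {
                lappend out [list $count $prev]
            }
            set prev $line
            set count 1
        }
    }
    if {$count > 0} {
        lappend out [list $count $prev]
    }
    return $out
}
```

Combined with lsort, something like uniq [lsort [cut $lines 0 :]] would count the distinct values in the first colon-separated column.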


Perhaps even some code like Glenn Jackman's:

proc touch {filename {time ""}} {
    if {[string length $time] == 0} {set time [clock seconds]}
    file mtime $filename $time
    file atime $filename $time
}

glennj: This proc has been accepted into tcllib 1.2: http://tcllib.sourceforge.net/doc/fileutil.html

US: Unix-like touch:

proc touch {filename {time ""}} {
    if {![file exists $filename]} {
        close [open $filename a]
    }
    if {[string length $time] == 0} {set time [clock seconds]}
    file mtime $filename $time
    file atime $filename $time
}

SS 2003-12-16: Trying to improve on the Tcl implementation of wc in the Great Language Shootout, I wrote this, which seems to take about half the execution time on big files:

set text [read stdin]
set c [string length $text]
set l [expr {[llength [split $text "\n\r"]]-1}]
set T [split $text "\n\r\t "]
set w [expr {[llength $T]-[llength [lsearch -all -exact $T {}]]-1}]
puts "\t$l\t$w\t$c"

Output seems to be identical to GNU's wc command.


SEH 2006-07-23 -- The proc fileutil::find is useful, but it has several deficiencies:

  • On Windows, hidden files are mishandled.
  • On Windows, checks to avoid infinite loops due to nested symbolic links are not done.
  • On Unix, nested loop checking requires a "file stat" of each file/dir encountered, a significant performance hit.
  • The basedir from which the search starts is not included in the results, as it is with GNU find.
  • If the basedir is a file, it is returned in the result not as a list element (like glob) but as a string.
  • The proc calls itself recursively, and thus risks running into interp recursion limits for very large systems.
  • fileutil.tcl contains three separate instantiations of proc find for varying os's/versions. Maintenance nightmare.

The following code eliminates all the above deficiencies. It checks for nested symbolic links in a platform-independent way, and scans directory hierarchies without recursion.

For speed and simplicity, it takes advantage of glob's ability to use multiple patterns to scan deeply into a directory structure in a single command, hence the name. Its calling syntax is the same as fileutil::find, so with a name change it could be used as a drop-in replacement.

SEH 2008-01-20: globfind has been rewritten to achieve greater speed, simplicity and function, and moved to its own page.
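
For readers who only want the general idea, here is a sketch of a non-recursive directory scan with a loop check. It is not globfind and not the fileutil implementation: the proc name walk is made up, it uses a plain work queue rather than globfind's multi-pattern glob, and it makes no attempt to handle hidden files on Windows. The loop check resolves each directory to a canonical path with the file dirname / file normalize idiom and skips anything already visited.

```tcl
# walk: return basedir and everything below it, without recursion.
proc walk {basedir} {
    set result {}
    set seen [dict create]
    set queue [list $basedir]
    while {[llength $queue] > 0} {
        # Pop the first queued path.
        set queue [lassign $queue dir]

        # Resolve symbolic links in every component, including the last,
        # so that a link back into the tree maps to a path seen before.
        set real [file dirname [file normalize [file join $dir __dummy__]]]
        if {[dict exists $seen $real]} {
            continue
        }
        dict set seen $real 1

        # Unlike fileutil::find, the starting point itself is reported.
        lappend result $dir

        foreach entry [glob -nocomplain -directory $dir -- * .*] {
            if {[file tail $entry] in {. ..}} {
                continue
            }
            if {[file isdirectory $entry]} {
                lappend queue $entry
            } else {
                lappend result $entry
            }
        }
    }
    return $result
}
```

Because the directories wait in a list rather than on the call stack, the depth of the tree is not limited by the interp recursion limit.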


gavino posted a question on comp.lang.tcl:

"I can not figure out the [globfind] syntax to limit it to finding say .pdf files. ... please someone post and [sic] example."

and Gerald Lester replied:

 proc PdfOnly {fileName} {
     return [string equal [string tolower [file extension $fileName]] .pdf]
 }

 set fileList [globfind $dir PdfOnly] 

SEH 20070317 -- A simpler alternative:

 set fileList [globfind $dir {string match -nocase *.pdf}]
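
Both forms work because the filter is a command prefix: the finder appends each file name as the last argument and treats the result as a boolean. The sketch below illustrates that convention with a made-up helper, filterNames. Exactly how globfind or fileutil::find invoke the filter internally is an assumption here.

```tcl
# filterNames: keep the names for which the filter command prefix
# returns true. The name is appended as the final argument.
proc filterNames {names filtercmd} {
    set keep {}
    foreach name $names {
        if {[uplevel 1 [linsert $filtercmd end $name]]} {
            lappend keep $name
        }
    }
    return $keep
}

# Expands to e.g.: string match -nocase *.pdf a.pdf
puts [filterNames {a.pdf B.PDF notes.txt} {string match -nocase *.pdf}]
```

With this convention, PdfOnly and {string match -nocase *.pdf} are interchangeable as filters.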

gavino 2011-03-21:

I could not get globfind to work with 8.6.

I wrote this because on Solaris 10 at work find sucks and is sometimes broken outright.

#! /home/g/tcl/bin/tclsh8.6.exe

#needs tcllib, I used 1.13 and cygwin at home, but use unix tcl+tcllib at work
package require fileutil
foreach file [fileutil::find /home/g {string match -nocase *.log}] {
    set filesize [file size $file]
    if {$filesize >= 1073741824} {
        set gigs [expr {$filesize / 1073741824}]
        puts "$gigs G $file"
    } elseif {$filesize >= 1048576} {
        set megs [expr {$filesize / 1048576}]
        puts "$megs M $file"
    } elseif {$filesize >= 1024} {
        set kilos [expr {$filesize / 1024}]
        puts "$kilos K $file"
    } else {
        puts "$filesize B $file"
    }
}

AMG: How is it misbehaving?


gavino: I was in a directory and ran find, and it didn't find the httpd.conf file I was looking at, let alone others. Permissions, no doubt, but you'd think find run as root would find the files anyhow? Perhaps permissions.


Laif: It should be noted, for those who are not familiar with Unix, that even on Windows XP, if fileutil::find encounters a folder or file named with a single tilde (~), it will append the contents of the person's home directory to the search results. Furthermore, there is a risk of infinite recursion if somewhere within your home folder there is also a folder named with a single tilde.
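
The cause is the tilde substitution that Tcl 8.x file commands apply to any name starting with ~. The usual workaround is to prefix such names with ./ before handing them to file or glob. A sketch, with a made-up helper name:

```tcl
# protectTilde: return a name that Tcl 8.x file commands will treat
# literally, even if it begins with "~".
proc protectTilde {name} {
    if {[string index $name 0] eq "~"} {
        return ./$name
    }
    return $name
}

# Example: test for a file literally named "~" in the current directory
# rather than for the home directory.
set literal [protectTilde ~]
puts [file exists $literal]
```

Whether a given version of fileutil::find applies this kind of protection internally is not something this page establishes.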


gavino 2011-03-24:

A faster, shorter, cooler version. It is especially fun if you pipe it to sort -n: ./gavinfind.tcl|sort -n

#!/usr/local/bin/tclsh
#needs tcllib, I used 1.13
package require Tcl 8.5.9
package require fileutil
foreach file [fileutil::find /export/home/g] {
    set filesize [file size $file]
    if {$filesize > 1073741824} {
        puts "[expr {$filesize / 1073741824}] G $file"
    } elseif {$filesize > 1048576} {
        puts "[expr {$filesize / 1048576}] M $file"
    }
}

I guess shell works too, but maybe tcl finds files that shell misses? hmm

find /export/home/gschuette -size +1000000c -type f -exec ls -lh {} \;|awk '{print $5 " " $9}'|sort -n|grep -v '[0-9]K'