globfind is designed as a fast and simple alternative to fileutil::find.
globfind is a directory hierarchy search utility. It takes advantage of glob's ability to use multiple patterns to scan deeply into a directory structure in a single command, hence the name.
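The multi-pattern idea can be illustrated with a minimal sketch. This is not globfind's actual code; the proc name globLevels and its fixed-depth interface are hypothetical, chosen only to show how one glob call with several patterns can cover several directory levels at once.

```tcl
# Hypothetical helper, for illustration only (requires Tcl 8.5+ for {*}).
# Builds the patterns *, */*, */*/*, ... and hands them all to one glob call.
proc globLevels {basedir depth} {
    set patterns {}
    set pat "*"
    for {set i 0} {$i < $depth} {incr i} {
        lappend patterns $pat
        append pat "/*"
    }
    # -nocomplain returns an empty list instead of raising an error
    # when nothing matches; -- ends option processing.
    return [glob -nocomplain -directory $basedir -- {*}$patterns]
}
```

A call such as globLevels [pwd] 3 returns everything up to three levels below the current directory in one list. Note that a plain * pattern does not match hidden dot-files on Unix, so this sketch skips them.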
Version 2.0 is a rewrite from scratch to be faster and more compact, featureful and error-resilient.
On searches of large directory spaces for files matching a glob pattern, globfind typically runs about three times faster than fileutil::findByPattern, and about 150% of the speed of GNU find.
Support for Tcl versions before 8.5 has been dropped.
fileutil::find is useful but it has several deficiencies:
globfind eliminates all the above deficiencies. It checks for nested symbolic links in a platform-independent way and scans directory hierarchies without recursion.
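A rough sketch of what a non-recursive scan with a symbolic-link check might look like follows. It is an assumption-laden illustration, not globfind's implementation: the procs realDir and walkDirs are hypothetical names, and the link check relies on the idiom of normalizing a dummy child path so that a link in the final path component is resolved as well.

```tcl
# Hypothetical helper: canonical path of a directory. file normalize
# resolves links in every component except the last, so a dummy child
# is appended and then stripped again with file dirname.
proc realDir {dir} {
    return [file dirname [file normalize [file join $dir __dummy__]]]
}

# Hypothetical iterative walk: a work queue replaces recursion, and a
# dict of canonical paths prevents revisiting a directory reached
# through a symbolic link (which would otherwise loop forever).
proc walkDirs {basedir} {
    set queue [list $basedir]
    set seen [dict create [realDir $basedir] 1]
    set result {}
    while {[llength $queue] > 0} {
        set dir [lindex $queue 0]
        set queue [lrange $queue 1 end]
        lappend result $dir
        foreach sub [glob -nocomplain -directory $dir -types d -- *] {
            set real [realDir $sub]
            if {[dict exists $seen $real]} {
                continue
            }
            dict set seen $real 1
            lappend queue $sub
        }
    }
    return $result
}
```

Because the check compares normalized path strings rather than inode numbers, it does not depend on platform-specific file attributes. Hidden directories are not visited by this sketch, since the * pattern does not match them.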
Its calling syntax is the same as fileutil::find, so with a name change it could be used as a drop-in replacement:
Usage: globfind ?basedir? ?filtercmd? ?switches?

Options:

basedir - the directory from which to start the search. Defaults to the current directory.

filtercmd - Tcl command; for each file found in the basedir, the filename will be appended to filtercmd and the result will be evaluated. The evaluation should return a boolean value; only files for which the evaluation returns true will be included in the final return result. ex: {file isdir}

switches - The switches will "prefilter" the results before the filtercmd is applied. The available switches are:

    -depth - sets the number of levels down from the basedir into which the filesystem hierarchy will be searched. A value of zero is interpreted as infinite depth.

    -pattern - a glob-style filename-matching wildcard. ex: -pattern *.pdf

    -types - any value acceptable to the "types" switch of the glob command. ex: -types {d hidden}

    -redundancy - eliminates redundant listing of real files that may occur due to symbolic links that link to directories within basedir (at the cost of slower execution). Stores names of such symbolic links in ::fileutil::globfind::redundant_files. Sets ::fileutil::globfind::REDUNDANCY to 1 if redundancies are found, otherwise 0.
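The filtercmd convention described above (append the filename to the command, evaluate, keep the file if the result is true) can be sketched as follows. The proc applyFilter is a hypothetical stand-in written for this illustration; it is not part of globfind, and evaluating the filter at global level is an assumption.

```tcl
# Hypothetical illustration of the filtercmd convention.
proc applyFilter {fileNames filtercmd} {
    # An empty filter keeps every file.
    if {$filtercmd eq ""} {
        return $fileNames
    }
    set kept {}
    foreach name $fileNames {
        # linsert appends the filename as a single list element, so
        # names containing spaces or brackets are passed through safely.
        if {[uplevel #0 [linsert $filtercmd end $name]]} {
            lappend kept $name
        }
    }
    return $kept
}
```

For example, applyFilter {a.tcl b.txt} {string match *.tcl} keeps only a.tcl, and a filter of {file isdirectory} keeps only the names that refer to directories. Building the command with linsert rather than string concatenation is the design choice that keeps unusual filenames from being re-parsed as Tcl code.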
AQI 2016-07-16: rglob is another much faster alternative to fileutil::find.
    proc rglob {basedir pattern} {

        # Fix the directory name. This ensures the directory name is in the
        # native format for the platform and contains a final directory separator
        set basedir [string trimright [file join [file normalize $basedir] { }]]
        set fileList {}

        # Look in the current directory for matching files. -type {f r}
        # means only readable normal files are looked at; -nocomplain stops
        # an error being thrown if the returned list is empty
        foreach fileName [glob -nocomplain -type {f r} -path $basedir $pattern] {
            lappend fileList $fileName
        }

        # Now look for any subdirectories in the current directory
        foreach dirName [glob -nocomplain -type {d r} -path $basedir *] {
            # Recursively call the routine on the subdirectory and append any
            # new files to the results
            set subDirList [rglob $dirName $pattern]
            if { [llength $subDirList] > 0 } {
                foreach subDirFile $subDirList {
                    lappend fileList $subDirFile
                }
            }
        }
        return $fileList
    }
example:
    time {util::rglob [pwd] *.tcl} 100
    1505.3 microseconds per iteration
    time {fileutil::findByPattern [pwd] *.tcl} 100
    8957.71 microseconds per iteration

Both provide the same result for basic glob matches. The difference is even larger on bigger file structures:

    time {fileutil::findByPattern [pwd] *.tcl} 10
    2696438.9 microseconds per iteration
    time {util::rglob [pwd] *.tcl} 10
    277771.5 microseconds per iteration