globfind is designed as a fast and simple alternative to fileutil::find.

See Also

Matthias Hoffmann - Tcl-Code-Snippets - misc - globx
getfiles cached
JOB - 2016-07-12 20:07:18: Relies on the same glob statement plus the ability to store search results in an additional cache file.


SEH 2008-01-20: globfind has been rewritten to achieve greater speed, simplicity and function.


SEH 2006-07-23: fileutil::find is useful but it has several deficiencies:

  • On Windows, hidden files are mishandled.
  • On Windows, checks to avoid infinite loops due to nested symbolic links are not done.
  • On Unix, nested loop checking requires a "file stat" of each file/dir encountered, a significant performance hit.
  • The basedir from which the search starts is not included in the results, as it is with GNU find.
  • If the basedir is a file, it is returned in the result not as a list element (like glob) but as a string.
  • fileutil::find calls itself recursively, and thus risks hitting the interp recursion limit on very large file systems.
  • fileutil.tcl contains three separate implementations of find for different OSes and versions, a maintenance nightmare.

globfind eliminates all the above deficiencies. It checks for nested symbolic links in a platform-independent way and scans directory hierarchies without recursion.
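
As a rough illustration of scanning without recursion while guarding against link loops, here is a minimal sketch. It is not globfind's own code: the proc names realdir and walktree are hypothetical, Tcl 8.5 or later is assumed (for dict), and the loop check simply compares fully resolved directory paths.

```tcl
# Hypothetical sketch, not globfind's implementation. Assumes Tcl 8.5+.

# Resolve every symbolic link in a directory path, including the last
# component, which plain [file normalize] leaves unresolved.
proc realdir {path} {
    return [file dirname [file normalize $path/__dummy__]]
}

# Walk a tree with an explicit work queue instead of recursive proc calls.
proc walktree {basedir} {
    set result [list $basedir]
    set queue  [list $basedir]
    # Directories already scanned, keyed by fully resolved path
    set seen [dict create [realdir $basedir] 1]
    while {[llength $queue] > 0} {
        set dir   [lindex $queue 0]
        set queue [lrange $queue 1 end]
        # A plain * pattern skips dot-files on Unix
        foreach path [glob -nocomplain -directory $dir -- *] {
            lappend result $path
            if {![file isdirectory $path]} continue
            set real [realdir $path]
            # A link back to an already scanned directory is reported
            # but not descended into, which breaks any loop
            if {[dict exists $seen $real]} continue
            dict set seen $real 1
            lappend queue $path
        }
    }
    return $result
}
```

Because the pending directories live in an ordinary list, the depth of the tree has no bearing on the interp recursion limit.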

For speed and simplicity, globfind takes advantage of glob's ability to use multiple patterns to scan deeply into a directory structure in a single command, hence the name. Its calling syntax is the same as fileutil::find, so with a name change it could be used as a drop-in replacement.
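
The multi-pattern technique can be sketched as follows. This is only an illustration, not the globfind source: the proc name globdepth is hypothetical, and the {*} expansion assumes Tcl 8.5 or later.

```tcl
# Hypothetical helper, not part of globfind: list everything up to
# $depth levels below $basedir with a single glob call.
proc globdepth {basedir depth} {
    if {$depth < 1} {
        return {}
    }
    # Build the patterns *, */*, */*/* ... one per directory level
    set patterns {}
    set pat *
    for {set i 0} {$i < $depth} {incr i} {
        lappend patterns $pat
        append pat /*
    }
    # -nocomplain yields an empty list rather than an error when nothing matches
    return [glob -nocomplain -directory $basedir -- {*}$patterns]
}
```

For example, `globdepth $basedir 3` asks glob for `*`, `*/*` and `*/*/*` in one command. On Unix these patterns do not match dot-files.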

After getting feedback and advice from AK, I decided to rewrite globfind to enhance simplicity, speed and function. You can now control search depth, and "prefilter" results by filename wildcard and file type.

The latest version includes sample application code, including a wrapper that duplicates much of the GNU find command syntax, and a directory sync utility.

I did some simple performance testing, and found that globfind is two to three times faster than the latest fileutil::find (ver. 1.13.3). I was also pleased to find that globfind is slightly faster than perl's standard directory traversal utility, File::Find.

For the performance tests, I did searches of the entire hard drive (13 GB of contents) of my WinXP notebook for all pdf files. Here are some sample results:

time {perl} 171135507 microseconds per iteration
time {globfind C:/ {string match -nocase *.pdf} -type f} 115472017 microseconds per iteration
time {globtraverse C:/ -pattern *.pdf} 97552737 microseconds per iteration
time {::fileutil::find C:/ {string match -nocase *.pdf}} 309160279 microseconds per iteration
time {::fileutil::findByPattern C:/ -glob *.pdf} 300450509 microseconds per iteration
time {globfind C:/ -pat *.pdf -type f} 95964709 microseconds per iteration

For easy comparison, relative results as a multiplier of the best performer:

Function                                             Timing Ratio   Notes
globfind C:/ -pat *.pdf -type f                      1              using prefilter only
globtraverse C:/ -pattern *.pdf                      1.02           core prefilter routine
globfind C:/ {string match -nocase *.pdf} -type f    1.20           equivalent use of postfilter
perl testfind.pl                                     1.78           used find2perl to generate test script
::fileutil::findByPattern C:/ -glob *.pdf            3.13
::fileutil::find C:/ {string match -nocase *.pdf}    3.22

The actual search and prefilter functions of globfind are now in a separate proc called globtraverse. globfind wraps and calls globtraverse, then applies an optional postfilter command.
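
A postfilter step of this kind might look like the sketch below. It is not taken from globfind: the proc name postfilter is hypothetical, and evaluating the filter at global level is an assumption, since the text above does not say where the command runs.

```tcl
# Hypothetical sketch of a postfilter step; not globfind's own code.
proc postfilter {files filtercmd} {
    set kept {}
    foreach f $files {
        # Append the filename as one extra argument, then evaluate the
        # command at global level (an assumption of this sketch)
        if {[uplevel #0 $filtercmd [list $f]]} {
            lappend kept $f
        }
    }
    return $kept
}

# Keeps a.pdf and C.PDF, drops b.txt
postfilter {a.pdf b.txt C.PDF} {string match -nocase *.pdf}
```

Wrapping the filename in `list` keeps names containing spaces intact as a single argument.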

Testing performance of hard drive access is tricky, but repeated runs gave results within 5-10% of each other. Still, globtraverse routinely came out fractionally slower than globfind.

The perl script was generated using the utility find2perl:

% perl find2perl / -name *.pdf >

The instructions below are excerpted from the program file:

globfind.tcl --

Written by Stephen Huntley
License: Tcl license
Version 1.5

The proc globfind is a replacement for tcllib's fileutil::find

Usage: globfind ?basedir ?filtercmd? ?switches??


basedir - the directory from which to start the search.  Defaults to current directory.

filtercmd - Tcl command; for each file found in the basedir, the filename will be
appended to filtercmd and the result will be evaluated.  The evaluation should return
0 or 1; only files whose return code is 1 will be included in the final return result.

switches - The switches will "prefilter" the results before the filtercmd is applied.  The
available switches are:

        -depth   - sets the number of levels down from the basedir into which the 
                     filesystem hierarchy will be searched. A value of zero is interpreted
                     as infinite depth.

        -pattern - a glob-style filename-matching wildcard. ex: -pattern *.pdf

        -types   - any value acceptable to the "types" switch of the glob command.
                     ex: -types {d hidden}

Side effects:

If somewhere within the search space a directory is a link to another directory within
the search space, then the variable ::globfind::REDUNDANCY will be set to 1 (otherwise
it will be set to 0).  The name of the redundant directory will be appended to the
variable ::globfind::redundant_files.  This may be used to help track down and eliminate
infinitely looping links in the search space.

Unlike fileutil::find, the name of the basedir will be included in the results if it fits
the prefilter and filtercmd criteria (thus emulating the behavior of the standard Unix 
GNU find utility).


globfind is designed as a fast and simple alternative to fileutil::find.  It takes 
advantage of glob's ability to use multiple patterns to scan deeply into a directory 
structure in a single command, hence the name.

It reports symbolic links along with other files by default, but checks for nesting of
links which might otherwise lead to infinite search loops.  It reports hidden files by
default unless the -types switch is used to specify exactly what is wanted.

globfind may be used with Tcl versions earlier than 8.4, but emulation of missing
features of the glob command in those versions will result in slower performance.

globfind is generally two to three times faster than fileutil::find, and fractionally
faster than perl's File::Find function for comparable searches.

The filtercmd may be omitted if only prefiltering is desired; in this case it may be a 
bit faster to use the proc globtraverse, which uses the same basedir value and 
command-line switches as globfind, but does not take a filtercmd value.

If one wanted to search for pdf files for example, one could use the command:

        globfind $basedir {string match -nocase *.pdf}

It would, however, in this case be much faster to use:

        globtraverse $basedir -pattern *.pdf
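
The difference between the two approaches can be seen on a single directory with plain glob. This is a hedged sketch, not globfind code, and the proc names pdfs_prefilter and pdfs_postfilter are hypothetical. In the first proc glob itself applies the pattern and type test; in the second every file name is returned to the script and tested there.

```tcl
# Hypothetical single-directory comparison; not part of globfind.

# Prefilter: glob does the pattern and type matching itself
proc pdfs_prefilter {dir} {
    return [glob -nocomplain -types f -directory $dir -- *.pdf]
}

# Postfilter: fetch every regular file, then test each name in script
proc pdfs_postfilter {dir} {
    set out {}
    foreach f [glob -nocomplain -types f -directory $dir -- *] {
        if {[string match -nocase *.pdf $f]} {
            lappend out $f
        }
    }
    return $out
}
```

The two are not strictly equivalent: on typical Unix file systems the glob pattern is matched case-sensitively, so the prefilter version would miss a file named REPORT.PDF that the -nocase postfilter accepts.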



proc PdfOnly {fileName} {
    # Accept only names whose extension is .pdf, ignoring case
    return [string equal [string tolower [file extension $fileName]] .pdf]
}

set fileList [globfind $dir PdfOnly]

SEH 2007-03-17: A simpler alternative:

set fileList [globfind $dir {string match -nocase *.pdf}]


gavino 2011-03-24:

This won't work for me. I noticed when I posted code that I had to indent all lines by one space or I lost formatting. Perhaps that happened here? Nice job if it works. I am trying to source globfind.tcl, a file into which I copied the above code, and I get errors when I try to run the pdf examples.

SEH: Sorry, I shouldn't have left out-of-date code here for so long. If you have trouble with the latest version linked to above, let me know.

SEH: The code is in its own namespace called ::fileutil::globfind. So in order to run the command you must either run "::fileutil::globfind::globfind" or import the commands into the current namespace; i.e., "namespace import ::fileutil::globfind::*". After the import you can simply use the command "globfind".

AQI 2016-07-16: rglob is another much faster alternative to fileutil::find.

proc rglob {basedir pattern} {

    # Fix the directory name, this ensures the directory name is in the
    # native format for the platform and contains a final directory separator
    set basedir [string trimright [file join [file normalize $basedir] { }]]
    set fileList {}

    # Look in the current directory for matching files, -type {f r}
    # means only readable normal files are looked at, -nocomplain stops
    # an error being thrown if the returned list is empty
    foreach fileName [glob -nocomplain -type {f r} -path $basedir $pattern] {
        lappend fileList $fileName
    }

    # Now look for any sub directories in the current directory
    foreach dirName [glob -nocomplain -type {d r} -path $basedir *] {
        # Recursively call the routine on the sub directory and append any
        # new files to the results
        set subDirList [rglob $dirName $pattern]
        if { [llength $subDirList] > 0 } {
            foreach subDirFile $subDirList {
                lappend fileList $subDirFile
            }
        }
    }
    return $fileList
}


time {util::rglob [pwd] *.tcl} 100

1505.3 microseconds per iteration

time {fileutil::findByPattern [pwd] *.tcl} 100

8957.71 microseconds per iteration

Both provide the same result for basic glob matches.

It's even better on larger file structures:

time {fileutil::findByPattern [pwd] *.tcl} 10

2696438.9 microseconds per iteration

time {util::rglob [pwd] *.tcl} 10

277771.5 microseconds per iteration

Page Authors

Gerald Lester