This is a relatively common request: someone wants, for one reason or another, to determine how much disk space is currently in use.

While on Unix one could say "exec /bin/du", that doesn't work as a cross-platform solution.
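
For comparison, the Unix-only route might look roughly like the sketch below. It assumes a POSIX-style du is on the PATH; the proc name duKilobytes is made up for illustration. Note that du reports allocated disk space in kilobytes, which is not the same figure as the sum of the file sizes.

```tcl
# Hypothetical helper: ask the system's du for a directory's disk usage.
# Assumes a POSIX-style du on the PATH; it fails where no du exists,
# which is exactly the portability problem mentioned above.
proc duKilobytes {dir} {
    # -s: print one summary line, -k: report in units of 1024 bytes.
    # exec raises an error if du writes to stderr (e.g. permission denied).
    set out [exec du -sk -- $dir]
    # The output looks like "<kilobytes><TAB><path>"; take the leading number.
    if {[scan $out %d kb] != 1} {
        error "unexpected du output: $out"
    }
    return $kb
}
```

A call such as "duKilobytes /tmp" returns a plain integer, or raises an error on a platform without du.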

Here's an attempt to provide some Tcl code to do this. However, I'm uncertain whether the calculation is correct. Perhaps some of my fellow Tcl'ers can look in here and determine whether my algorithm is correct.


 #! /usr/tcl84/bin/tclsh8.4
 # Name: du.tcl
 # Purpose: 
 # Given a directory, calculate the number of bytes of plain files within
 # the directory
 # Author: [email protected]
 # Date: Sept. 26, 2002
 # Version: 1.0

 package require log 
 package require fileutil

 log::lvChannel debug stderr

 proc dirsize {directory} {
        if { [file exists $directory ] == 0 } {
                return 0
        }
        if { [file readable $directory ] == 0 } {
                return 0
        }
        set size 0
        set noaccess {}
        # glob -directory already returns each match with $directory prepended
        foreach path [glob -nocomplain -directory $directory -types f -- *] {
                if { [file readable $path] } {
                        incr size [file size $path]
                } else {
                        lappend noaccess $path
                }
        }
        if { [llength $noaccess] != 0 } {
                log::log debug $noaccess
        }
        return $size
 }

 proc isdir {path} {
        return [file isdirectory $path]
 }

 proc dir_totalsize {directory} {
        if { [file exists $directory ] == 0 } {
                return 0
        }
        if { [file readable $directory ] == 0 } {
                return 0
        }

        set size 0
        set noaccess {}
        # fileutil::find already returns paths that include $directory
        foreach path [::fileutil::find $directory isdir] {
                if { [file readable $path] } {
                        incr size [dirsize $path]
                } else {
                        lappend noaccess $path
                }
        }
        if { [llength $noaccess] != 0 } {
                log::log debug $noaccess
        }
        return $size
 } 

 # Test out implementation

 if { [file exists /tmp/small] == 0 } {
        file mkdir /tmp/small
        file copy /etc/motd /tmp/small/motd
 }

 puts [format "Size of /tmp/small is %d" [dirsize /tmp/small] ]
 puts [format "Size of %s is %d" $env(HOME) [dirsize $env(HOME)] ]
 puts [format "Size of /not/present is %d" [dirsize /not/present] ]
 puts [format "Total size of %s is %d" $env(HOME) [dir_totalsize $env(HOME)] ]

See also du


Martin Lemburg - 27.09.2002:

The proc du in du is something completely different from "dirsize" and "dir_totalsize". But I tried out your procs and got results that differ completely from reality. I tried the following:

    % dirsize g:/programme
    22351
    % dir_totalsize g:/programme
    12651754

With the following proc "dirSize" ...

 proc dirSize {obj {recursive 0}} {
    set size    0;

    foreach subObj [glob -nocomplain \
                                [file join $obj *] \
                                [file join $obj {.[a-zA-Z0-9]*}]] {
        if {$recursive && [file isdirectory $subObj]} {
            incr size   [dirSize $subObj 1];
        } else {
            incr size    [file size $subObj];
        }
    }

    return $size;
 }

... I got the following:

    % dirSize g:/programme
    22351
    % dirSize g:/programme 1
    279744410

My Explorer reports (including all hidden files and directories) 289,795,331 bytes.

So something goes wrong in your proc "dir_totalsize".


Yes, unfortunately the dir_totalsize doesn't recurse. And I'm having computer problems today which prevent me from making adjustments to the script...


Okay, Version 2 attempts to recurse - but there still appear to be differences.
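
One possible source of such differences is that a plain "*" pattern skips hidden files (dot files on Unix, files with the hidden attribute on Windows). Below is a minimal core-only sketch of a recursive byte count that also asks glob for hidden entries. The proc names listEntries and treeSize are invented for illustration, and the choice to skip symbolic links rather than follow them is an assumption, not something the scripts on this page do.

```tcl
# Hypothetical helper: list every entry of a directory, hidden ones included.
proc listEntries {dir} {
    set entries [concat \
        [glob -nocomplain -directory $dir -- *] \
        [glob -nocomplain -directory $dir -types hidden -- *]]
    set result {}
    # lsort -unique guards against an entry being reported by both calls
    foreach path [lsort -unique $entries] {
        set tail [file tail $path]
        # On Unix the hidden entries include "." and ".."; skip them
        if {$tail ne "." && $tail ne ".."} {
            lappend result $path
        }
    }
    return $result
}

# Hypothetical helper: sum the sizes of all plain files below a directory.
proc treeSize {dir} {
    set total 0
    foreach path [listEntries $dir] {
        # file type does not follow links, so a symbolic link reports
        # "link" and is skipped (assumption: links are not counted)
        if {[catch {file type $path} type]} {
            continue
        }
        switch -- $type {
            file {
                # Ignore files whose size cannot be read
                if {![catch {file size $path} n]} {
                    incr total $n
                }
            }
            directory {
                incr total [treeSize $path]
            }
        }
    }
    return $total
}
```

Calling "treeSize $env(HOME)" would then give a total to compare with dir_totalsize. The result is still a sum of file sizes in bytes, so it need not match what a file manager shows as space used on disk.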


Martin Lemburg - 30.09.2002:

Sorry, but why so complicated? Why use Tcllib where it is not needed? Why use external libraries where they are not needed? Core Tcl provides enough capabilities to solve this kind of problem ("how much space is used by (or in) a directory"). And core Tcl provides enough capabilities for "slim" solutions, without always reinventing the wheel! Recursing through a structure to collect information is an especially common problem, so why use a library if it's more of a design pattern problem? Isn't it useful to store a kind of code snippet instead of using external code, libraries, packages?


LV: Why use Tcllib? Because why rewrite code that has already been written? When you want to go on a trip, do you first build a vehicle, sew your own clothes and make your own shoes? Many people have better things to do than to write every piece of code from scratch...


Martin Lemburg - It's not about rewriting from scratch! I work in a software company where people are suspicious of external libraries and packages that are not bought products (warranties, updates, maintenance, ...). And since the products delivered to our customers should rely as little as possible on libraries that are not from our company, we have to collect code snippets and patterns and reuse them. It is (in my eyes) not always useful to need the complete Tcllib just to be able to use a single package (e.g. fileutil) to solve simple problems. And if it's a runtime-dependent problem, it is perhaps even worse to use external packages that cannot be optimized, or that rely on an unknown number of other packages. And sometimes using external packages/libraries is like (in German) "mit Kanonen auf Spatzen schießen" - "shooting at sparrows with cannons".

That's why I like the Tcler's Wiki! It's the best chance to collect code (snippets) in a central place and make it available to all Tclers!

This could also be a discussion about "all batteries included". We, and I think many other companies, can only use pure Tcl, without any other package! So published solutions for common problems should be as simple as possible and not rely on anything other than pure core Tcl.


RS I fully agree. Simple solutions (and many are simple in Tcl) need not be administered in a library on which one's app depends. I've earlier started to collect a library of utilities ("max", "lrevert", "every"... that kind of stuff), but recently I prefer to paste those in where I need them - or even reinvent them (or rewrite from memory). A single file is still the most robust unit of deployment, even if not a Starkit...


LV I have never understood this approach. Why cut and paste code, resulting in many more places to fix when bugs are found, instead of creating a single copy in a library that everyone then uses and which only has to be fixed once, documented once, etc.?