Version 13 of directory size

Updated 2005-04-11 18:30:14

Fred Limouzin: 2005/04/11.

The file size command currently (i.e. Tcl8.4.9) does not support directories. A pure-Tcl solution involves adding recursively the size of files in the current and in sub-directories.

Here's a Tk/GUI script that I wrote this week-end, which fits my need. I wanted a gradient-color indication for the size of directories. You can position the orange/yellow scale (size at which the color indicator is orange/yellow; below that value the indicator is gradient going toward green (size=0)) and the red scale (size from which the indicator is red!). A scale selector also lets you decide how many levels of sub-directories to display.

Note: at times it returns slight discrepencies with the info returned by the "properties" context-menu (on WinXp that is; haven't tested on unix yet). Since hidden files are covered in the script below, the discrepencies may be caused by system files. Generally speaking these differences -when they exist - are negligible.

Note: this doesn't yet support links (especially links on dir). Should be easy to add as long as we consider that a link on a dir has a size of 0 (in fact it's usually the length of the string representing the path of the object linked), but remember the script was used to find big directories, so again I take that as negligible!). Bear in mind this script fited my needs ;-).

Screenshot: http://dire.straits.free.fr/vertigo/dirsize2.gif

The code can also be found at http://dire.straits.free.fr/vertigo ([L1 ]).

 #!/bin/sh
 # Frederic Limouzin Copyrights (c)2005; All rights reserved \
 exec tclsh "$0" ${1+"$@"}

 package require Tk

 set tcl_precision 17

 catch {console hide} r ; unset r

 ###############################################################################

 set dir(root) C:/    ;# [file join C:/ Temp]
 set dir(log)  [file join [pwd] [file tail [file rootname $argv0]].log]
 set dir(sel)  $dir(root)

 if {[file exists $dir(log)]} {
     file rename -force $dir(log) $dir(log).bck
 }
 set Log [open $dir(log) w]

 set ::MAXLVL 2 ;# level of sub dir to display
 set ::KILO 1024.0
 set ::MID 50  ;# 50% (orange) at mid in Mbytes
 set ::MAX 250 ;# 100% (red) at and above max in Mbytes

 ###############################################################################

 button .xit -text {Exit} -command {Quit}
 frame .d
 button .d.dirsel -text {Select Directory} -command {SelDir} -font Courier
 entry .d.dir -textvariable dir(sel)
 button .sze -text {Calc Size} -command {DirSize} -relief raised -font Courier
 scale .s -from 1 -to 30 -length 300 -resolution 1    \
   -label {Max SubLevel to display:} -variable MAXLVL -command {} \
   -orient horizontal -tickinterval 4 -showvalue true -relief groove
 scale .smid -from 1 -to [expr {int(1.0 * $::KILO)}] -length 300 -resolution 1 \
   -label {50% (orange) at size (in Mbytes):} -variable MID -command {} \
   -orient horizontal -tickinterval 100 -showvalue true -relief groove
 scale .smax -from 1 -to [expr {int(1.0 * $::KILO)}] -length 300 -resolution 1 \
   -label {100% (red) at size (in Mbytes):} -variable MAX -command {} \
   -orient horizontal -tickinterval 100 -showvalue true -relief groove
 frame .t
 set Txt .t.txt
 text $Txt -width 80 -height 15 -wrap none -font Courier \
      -yscrollcommand {.t.scrolly set} -xscrollcommand {.t.scrollx set}
 scrollbar .t.scrollx -relief flat -orient horizontal -command {$Txt xview}
 scrollbar .t.scrolly -relief flat -orient vertical   -command {$Txt yview}
 label .cprght -text {Copyrights (c)2005 Fred-Phenix, Fred Limouzin}

 pack .d.dirsel -side left
 pack .d.dir -side right -fill x -expand true
 pack .cprght -side bottom -fill x
 pack .t.scrollx -side bottom -fill x
 pack $Txt -side left -fill both -expand true
 pack .t.scrolly -side right -fill y
 pack .xit -side bottom -fill x
 pack .t -side bottom -fill both -expand true
 pack .d -side top -fill x
 pack .smid -side top -fill x
 pack .smax -side top -fill x
 pack .s -side top -fill x
 pack .sze -side top

 ###############################################################################
 # for input 12345678
 # if m=0 : result = 12345678 bytes
 # if m=1 : result = 12,345,678 bytes (to be done)
 # if m=2 : result = 11.77 Mbytes
 proc Norma {v {m 0}} {
     if {$m == 0} {
         set rv "[expr {wide($v)}] bytes"
     } elseif {$m == 1} {
         ;# tbd
         set rv {to be done}
     } else {
         array set unitArr {0 bytes 1 kbytes 2 Mbytes 3 Gbytes 4 Tbytes}
         set idx 0
         set nv $v
         while {true} {
             set dv $nv
             set nv [expr {1.0 * wide($dv) / $::KILO}]
             if {$nv > 1.0} {
                 incr idx
             } else {
                 break
             }
         }
         set rv [format {%3.2f %s} $dv $unitArr($idx)]
     }
     return $rv
 }

 ###############################################################################
 proc Clamp {v {min 0} {max 255}} {
     if {$v < $min} {
         return $min
     } elseif {$v > $max} {
         return $max
     } else {
         return $v
     }
 }

 ###############################################################################
 proc CalcColor {y} {
     set y [Clamp $y 0.0 1.0]
     set blu 0
     set gre [Clamp [expr {int(255.0 * (1.0 - $y))}]]
     set red [Clamp [expr {int(255.0 * 2.0 * $y)}]]
     return [format {#%02X%02X%02X} $red $gre $blu]
 }

 ###############################################################################
 proc GetColor_Square {x} { ;# x in bytes
     set x [Clamp [expr {1.0 * wide($x) / ($::KILO * $::KILO)}] 0.0 $::MAX] ;# x in Mb
     set a [expr {(1.0 * $::MAX - (2.0 * $::MID)) / (2.0 * $::MID * $::MAX * ((1.0 * $::MID) - $::MAX))}]
     set b [expr {(1.0 / (2.0 * $::MID)) - (1.0 * $a * $::MID)}]
     set y [expr {(1.0 * $a * wide($x * $x)) + (1.0 * $b *$x)}]
     return [CalcColor $y]
 }

 ###############################################################################
 proc GetColor_Linear {x} { ;# x in bytes
     set x [Clamp [expr {1.0 * wide($x) / ($::KILO * $::KILO)}] 0.0 $::MAX] ;# x in Mb
     if {$x < $::MID} {
         set a [expr {1.0 / (2.0 * $::MID)}]
         set b 0.0
     } else {
         set a [expr {1.0 / (2.0 * ($::MAX - $::MID))}]
         set b [expr {1.0 - ($a * $::MAX)}]
     }
     set y [expr {(1.0 * $a * wide($x)) + (1.0 * $b)}]
     return [CalcColor $y]
 }

 ###############################################################################
 proc GetColor {x {mode 0}} {
     if {$mode == 0} {
         return [GetColor_Linear $x]
     } else {
         return [GetColor_Square $x]
     }
 }

 ###############################################################################
 proc SelDir {} {
     global dir
     set tmp [tk_chooseDirectory -title "Choose Root directory" -initialdir $dir(sel)]
     if {$tmp ne {}} {
         set dir(sel) $tmp
     }
     return $dir(sel)
 }

 set Tagidx 0

 ###############################################################################
 proc log {txt {clr #FFFFFF}} {
     global Log
     global Txt
     global Tagidx
     puts $Log $txt
     ;#puts $txt
     $Txt tag configure tagn($Tagidx) -background $clr
     $Txt insert end "___" tagn($Tagidx)
     $Txt insert end $txt\n
     $Txt see end
     incr Tagidx
     update idletasks
     return 0
 }

 ###############################################################################
 proc DirSize_Recurs {dir {level 0}} {
     set nextLevel [expr {$level + 1}]
     catch {cd $dir} res
     if {$res ne {}} {
         return 0
     }
     ;#set dirlst [glob -nocomplain *]
     set dirlst [concat [glob -nocomplain *] [glob -type hidden -nocomplain *]]
     set size 0
     foreach e $dirlst {
         set ndir [file join $dir $e]
         if {![file exists $ndir]} {
             continue
         }
         if {[file isdirectory $ndir]} {
             set s [DirSize_Recurs [file join $dir $ndir] $nextLevel]
         } else {
             set s [file size $ndir]
         }
         set size [expr {wide($size + $s)}]
     }
     if {$level < $::MAXLVL} {
         set clr [GetColor $size]
         log [format {%14s %20s : %s} [Norma $size 2] ([Norma $size]) $dir] $clr
     }
     return $size
 }

 ###############################################################################
 proc DirSize {} {
     global dir
     log [string repeat - 60]
     DirSize_Recurs $dir(sel)
     return 0
 }

 ###############################################################################
 proc Quit {} {
     global Log
     close $Log
     exit
 }

LV What you have written here is a useful function. However, it isn't what _I_ would think of if someone asked me the file size of a directory. Instead, I'd expect that they wanted to know the number of bytes that the directory's name/inode (on unix anyways) contained. On Unix, the above functionality would be provided by the du command, right?

Fred: Hi Larry. I am not sure I got your point, but yes what I wanted was the disk usage (with links not being followed). In other words the sum of the files' sizes in bytes in the directory and its sub-directories (again, not following links). (I didn't want the reserved space in blocks, but the actual used space.) I usually work under Solaris, but I have to admit that that script was for my notebook under WinXP, so I haven't thought through the issues in unix. It did what I wanted, but I'm not saying it'll fit everyone's needs.

I just noticed that your du was a link (I first thought you refered to the unix command), and it is more likely that it was indeed what I was after. Oh well! :-).


KPV: Here's a simpler way of computing all the bytes used in a directory and all its subdirectories. It uses the fileutil module from tcllib.

  package require fileutil
  set total 0
  foreach fileName [::fileutil::find .] {
    incr total [file size $fileName]
  }

NB. this has one draw back in that it generates a list of all files which can be expensive. A better way would be to use the filtercmd option to ::fileutil::find but there's a design bug in that interface in that the filtercmd only gets passed the file name with no directory info.


'Fred' 20050411: So long as we can agree on what directory size means, would a Tcl command doing that be useful? Is it worth TIP'ing it? Could this functionality be added to the file size command? Even if the C code itself has to be a recursive function adding up the sub-files, I still think it'd be worth having it available as a command rather than the above recursive procedure. Comments/Points-of-view most welcome!


escargo 11 Apr 2005 - It might be worth taking a look at Tree Size [L2 ] for comparison purposes.


Category Application