Human readable file size formatting

Ro December 26, 2012

File sizes can be hard to compare in an application when they are shown in a mix of GB, MB, and KB. So I show all sizes the same way, always in MB. I feel this is a good tradeoff.

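# Insert thousands separators into a non-negative number,
# e.g. 1234567.89 -> 1,234,567.89
proc commify {x} {
  # split off the decimal trailer, if any, so only the integer digits are grouped
  set trailer ""
  set pix [string last "." $x]
  if {$pix != -1} {
    set trailer [string range $x $pix end]
    set x [string range $x 0 [expr {$pix - 1}]]
  }

  # walk the digits from the right in groups of three; a short final
  # group leaves b and/or c empty, which the join simply drops
  set z {}
  foreach {a b c} [lreverse [split $x ""]] {
    lappend z [join [list $c $b $a] ""]
  }
  set ret [join [lreverse $z] ","]
  append ret $trailer
  return $ret
}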

proc hformat {x} {

  # convert bytes to megabytes (2**20 bytes)
  set q [expr {($x * 1.0) / pow(2,20)}]

  if {$q < 7} {
    # small value: truncate (not round) to at most two decimal places
    set q [expr {entier($q * 100) / 100.0}]
  } else {
    # large value: round to a whole number of megabytes
    set q [expr {round($q)}]
  }

  return "[commify $q] MB"
}

Ok so here is some output to show you how this thing works:

(apps) 5 % hformat 93838188111
89,491 MB
(apps) 6 % hformat 19192
0.01 MB
(apps) 7 % hformat 1919249
1.83 MB
(apps) 8 % hformat 139101
0.13 MB
(apps) 9 % hformat 481883842001
459,560 MB
(apps) 10 % hformat 4818838420010393
4,595,602,436 MB
(apps) 11 % hformat 48188384200
45,956 MB
(apps) 12 % hformat 75002991
72 MB
(apps) 13 % hformat 750029914
715 MB
(apps) 14 % hformat 1048576 
1.0 MB
(apps) 15 % hformat 666,792,000
can't use non-numeric string as operand of "*"
(apps) 16 % hformat 666792000
636 MB
(apps) 17 % hformat 840957664
802 MB
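
The `hformat 666,792,000` call in the transcript fails because the commas make the argument non-numeric. If sizes may arrive already formatted, one option is to strip the separators before doing any arithmetic. This is only a minimal sketch; the helper name `stripcommas` is hypothetical and not part of the code above:

```tcl
# Hypothetical helper: remove thousands separators so the value can be
# used as a numeric operand again.
proc stripcommas {x} {
    return [string map [list , ""] $x]
}

puts [stripcommas "666,792,000"]   ;# prints 666792000
```

With that in place, something like `hformat [stripcommas $size]` should accept both plain and comma-formatted input.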

Adding commas to numbers makes them easier to read, and showing the sizes in the same way (megabytes always) makes it easy to compare relative sizes. I don't like seeing 4.02 GB and then 179 MB as it's not immediately obvious which one is bigger. Of course using colors and graphics next to file sizes to show relative weight (size of file) is even better. But that's not what this page is about ;P
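
The commify procedure above groups characters without looking at what they are, which is fine for file sizes but would turn a signed value such as -123 into "-,123". If signed numbers matter, a regular-expression loop is one possible alternative. The sketch below is an assumption-laden variant, not the original author's code, and the name `commify_re` is hypothetical:

```tcl
# Hypothetical alternative to commify: repeatedly insert a comma before the
# last three digits of the leading integer part until no more fit.
# An optional sign is kept with the first group; digits after a decimal
# point are never touched because the pattern is anchored at the start.
proc commify_re {x} {
    while {[regsub {^([-+]?\d+)(\d{3})} $x {\1,\2} x]} {}
    return $x
}

puts [commify_re 4595602436]    ;# 4,595,602,436
puts [commify_re -1234567.89]   ;# -1,234,567.89
puts [commify_re 0.01]          ;# 0.01
```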


crn January 09, 2020

Here is a generic procedure that picks the appropriate standard binary prefix (Ki, Mi, Gi, ...) for a value:

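proc HumanReadableUnit {value suffix} {
    # zero has no logarithm, so handle it up front
    if {$value == 0} {
        return "$value $suffix"
    }
    # work on the magnitude and restore the sign at the end
    set mult 1
    if {$value < 0} {
        set mult -1
        set value [expr {$value * $mult}]
    }

    set prefixes [list "" "Ki" "Mi" "Gi" "Ti" "Pi" "Ei" "Zi" "Yi"]
    # number of whole powers of 1024 in the value, clamped to the prefix
    # list so values below 1 or beyond the Yi range still get a valid prefix
    set log_n [expr {int( log( $value ) / log(1024) )}]
    set log_n [expr {max(0, min($log_n, [llength $prefixes] - 1))}]
    set prefix [lindex $prefixes $log_n]
    set value [expr {$mult * $value / pow(1024, $log_n)}]
    return "[format %.2f $value] ${prefix}${suffix}"
}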
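
The procedure above finds the prefix with a floating-point logarithm. A possible alternative is to divide by 1024 repeatedly, which sidesteps any rounding in `log()` and keeps the prefix index inside the list by construction. This is only a sketch under those assumptions; the name `HumanReadableUnitLoop` is hypothetical, and unlike the procedure above it formats zero as 0.00:

```tcl
# Hypothetical variant of HumanReadableUnit that picks the prefix by
# repeated division instead of log().
proc HumanReadableUnitLoop {value suffix} {
    set prefixes [list "" Ki Mi Gi Ti Pi Ei Zi Yi]
    set sign [expr {$value < 0 ? -1 : 1}]
    set v [expr {abs($value) * 1.0}]
    set i 0
    # stop at the last prefix so the index never runs off the list
    while {$v >= 1024 && $i < [llength $prefixes] - 1} {
        set v [expr {$v / 1024}]
        incr i
    }
    return "[format %.2f [expr {$sign * $v}]] [lindex $prefixes $i]$suffix"
}

puts [HumanReadableUnitLoop 1536 B]         ;# 1.50 KiB
puts [HumanReadableUnitLoop 1073741824 B]   ;# 1.00 GiB
puts [HumanReadableUnitLoop -2048 B]        ;# -2.00 KiB
puts [HumanReadableUnitLoop 0.5 B]          ;# 0.50 B
```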