Version 9 of Comparing files in Tcl

Updated 2005-11-21 15:14:03

Recently a discussion occurred on the wiki's chat regarding how to compare two files.

While _some_ operating systems come with such utilities, portable code cannot rely on them, so a Tcl developer needs a solution written in Tcl itself. Perhaps such code will eventually find its way into fileutil.

 westlife       Actually my requirement was to compare two files in binary mode. 
 westlife       My friend told me to take them in string using read command and compare them. 
 westlife       What do you think, is it an efficient and correct way to compare two files in binary mode?
 dkf    Two binary files?  That's easy enough.
 dkf    Use [read] to get the data in (making sure you've [fconfigure $chan -translation binary] first)
 dkf    And then use [string compare] or [string equal] or whatever.
 westlife       how can i do that dkf
 westlife       but is it efficient way ?
 dkf    proc readBinaryFile {filename} {
            set f [open $filename]
            fconfigure $f -translation binary
            set data [read $f]
            close $f
            return $data
        }
 dkf    If your files are small enough (e.g. up to a few megabytes) that'll work just fine.
 arjen  If you read it in chunks (especially with large files), then quit as soon as you find a difference
 arjen  Use: read $f $chunksize
 dkf    If they're really big, you'll need to chunk it
 westlife       yes that's what wanted to say 
 dkf    msg x'ed 
 westlife       oh i.c
 westlife       thanx dkf
 dkf    Using chunks is slower if your files could fit into your (physical) memory, 
 dkf    but if they can't it is much faster.
 stevel Also, smaller chunks keep your UI responsive
 dkf    proc cmpFilesChunked {file1 file2 {chunksize 16384}} {
           set f1 [open $file1]; fconfigure $f1 -translation binary
           set f2 [open $file2]; fconfigure $f2 -translation binary
           while {1} {
              set d1 [read $f1 $chunksize]
              set d2 [read $f2 $chunksize]
              set diff [string compare $d1 $d2]
              if {$diff != 0 || [eof $f1] || [eof $f2]} {
                 close $f1; close $f2
                 return $diff
              }
           }
        }
 dkf    That's untested, but I think it'll work...
 westlife        thanx dkf
 lvirden        westlife, if you return, a suggestion - check the file sizes before beginning 
 lvirden        the file reading process - if the sizes are not equal, then the files are not equal.
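
lvirden's size check can be combined with the chunked reading shown in the chat. What follows is only a minimal sketch: the proc name filesEqual is made up for illustration, it returns a boolean (1 for identical contents, 0 otherwise) rather than the [string compare] result, and it assumes neither file changes while the comparison runs.

```tcl
# Hypothetical helper (not from the chat): 1 if both files hold the same bytes, else 0.
proc filesEqual {file1 file2 {chunksize 16384}} {
    # Cheap test first: files of different sizes cannot be equal.
    if {[file size $file1] != [file size $file2]} {
        return 0
    }
    set f1 [open $file1]
    fconfigure $f1 -translation binary
    set f2 [open $file2]
    fconfigure $f2 -translation binary
    set equal 1
    # The sizes match, so both channels reach end of file together.
    while {![eof $f1]} {
        set d1 [read $f1 $chunksize]
        set d2 [read $f2 $chunksize]
        if {![string equal $d1 $d2]} {
            # Stop at the first chunk that differs.
            set equal 0
            break
        }
    }
    close $f1
    close $f2
    return $equal
}
```

A call such as [filesEqual a.bin b.bin] then reads any file data only when the two sizes agree, and stops at the first differing chunk.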

Fred Limouzin (2004/03/14): I also use a CRC to compare two (binary) files. It works if you are only interested in whether the files are equal or different; if you need to report the differences, that's another story! So what I do is the following (this needs a package require crc32, for instance, assuming your specification allows it):

  • if file sizes differ then files differ
  • else if crc32 differ then files differ
  • else files are considered equal (matching CRCs make a difference very unlikely, though not impossible)

It dramatically improves the comparison speed.
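
The three steps listed above might be sketched as follows. This assumes tcllib's crc32 package is installed; the proc name filesProbablyEqual is invented for illustration and is not Fred's actual code. Equal CRC-32 values make equal contents very likely but do not prove it, so a byte-by-byte comparison is still the safer choice where certainty matters.

```tcl
package require crc32   ;# part of tcllib (assumed to be installed)

# Hypothetical helper: 1 if sizes and CRC-32 checksums match, else 0.
proc filesProbablyEqual {file1 file2} {
    # Step 1: different sizes mean different files.
    if {[file size $file1] != [file size $file2]} {
        return 0
    }
    # Step 2: different checksums mean different files.
    # -filename makes crc32 open the file and read it in binary mode.
    set crc1 [::crc::crc32 -filename $file1]
    set crc2 [::crc::crc32 -filename $file2]
    if {$crc1 != $crc2} {
        return 0
    }
    # Step 3: same size and same checksum, so treat the files as equal.
    return 1
}
```

Both files are still read in full whenever the sizes match, so the checksum is most useful when it can be stored and reused, for example when one reference file is checked against many candidates.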

