GNU Zip (or ''gzip'' for short) is a stream compression program (like the UNIX [compress] program), and not an archiver (like the DOS [pkzip] program) that uses an ''unpatented'' (and unpatentable) algorithm for its compression. The program [bzip2] can compress even more, but takes more memory to do so. The [zlib] library uses the same algorithms, and can be used to process or produce gzipped files (IIRC, though there might be a header too.) This governed by three [RFC]'s * RFC 1950 [http://www.gzip.org/zlib/rfc-zlib.html] - Zlib Compressed Data Format * RFC 1951 [http://www.gzip.org/zlib/rfc-deflate.html] - Deflate Compressed Data format * RFC 1952 [http://www.gzip.org/zlib/rfc-gzip.html] - Gzip File Format See also http://en.wikipedia.org/wiki/Gzip See [GASP], [tkArchive], [TkZip], Also take a look at [bzip2] and [LZMA] for better compression rates Are there any Tcl bindings for [zlib]? '''18Aug04 [PS]''' Yes there is! tclkit has it by default and I created an extension from that. See the [zlib] page. ---- '''gunzip a file with [zlib] and Tcl''' ====== proc gunzip { file {outfile ""} } { package require zlib # Gunzip the file # See http://www.gzip.org/zlib/rfc-gzip.html for gzip file description set in [open $file r] fconfigure $in -translation binary -buffering none set id [read $in 2] if { ![string equal $id \x1f\x8b] } { error "$file is not a gzip file." } set cm [read $in 1] if { ![string equal $cm \x8] } { error "$file: unknown compression method" } binary scan [read $in 1] b5 FLAGS puts $FLAGS foreach {FTEXT FHCRC FEXTRA FNAME FCOMMENT} [split $FLAGS ""] {} binary scan [read $in 4] i MTIME set XFL [read $in 1] set OS [read $in 1] if { $FEXTRA } { binary scan [read $in 2] S XLEN set ExtraData [read $in $XLEN] } set name "" if { $FNAME } { set XLEN 1 set name [read $in $XLEN] set c [read $in 1] while { $c != "\x0" } { append name $c set c [read $in 1] } } set comment "" if { $FCOMMENT } { set c [read $in 1] while { $c != "\x0" } { append comment $c set c [read $in 1] } } set CRC16 "" if { $FHCRC } { set CRC16 [read $in 2] } set cdata [read $in] close $in binary scan [string range $cdata end-7 end] ii CRC32 ISIZE set data [zlib inflate [string range $cdata 0 end-8]] if { $CRC32 != [zlib crc32 $data] } { error "gunzip Checksum mismatch." } if { $outfile == "" } { set outfile $file if { [string equal -nocase [file extension $file] ".gz"] } { set outfile [file rootname $file] } } if { [string equal $outfile $file] } { error "Will not overwrite input file. sorry." } set out [open $outfile w] fconfigure $out -translation binary -buffering none puts -nonewline $out $data close $out file mtime $outfile $MTIME } ====== [MHo] 2021-07-14: I always see '''00000''' on stdout, everytime I call the procedure ? '''18Aug04 [PS]''' ---- [LES]: Would someone tell me HOW this is better than '''[[exec gzip filename]]'''? [PS]: gzip might not be installed? And with a small tweak, you'd just get the file content - put that together with vfs::tar and you might be able to mount tar.gz files... [SRIV] Stock Windows installs don't have gzip. Think cross platform. "Better" is in the eye of the beholder. Nice work. [PHo] 2025-02-01: Seeing some 'Checksum mismatch' errors which are resolved by scanning the checksum bytes as unsigned: ====== binary scan [string range $cdata end-7 end] iu1iu1 CRC32 ISIZE ;# zlib crc32 returning unsigned values, and no negative data sizes ====== ---- [DKF]: Here's a cheap way to invoke ''gzip'' on Windows. Note that just using '''exec gzip -c <<$d''' does not work because of translation issues. ====== proc gzip d { set data [open foo.tmp w] fconfigure $data -translation binary puts -nonewline $data $d close $data set f [open "|gzip -c <foo.tmp" r] fconfigure $f -translation binary set d [read $f] close $f after 100 ;# Ugly hack to give gzip time to exit so we can kill foo.tmp on Windows file delete foo.tmp return $d } ====== ---- '''14Dec06 [gl]''' : Here is a way to read and write gzip .gz files transparently and on the fly -- including channels like "stdin" -- with the help of [Trf]: ====== package require Trf namespace eval gz { variable CRC # Attach to a channel for writing -- i.e., write .gz header, enable compression proc attach_w {f} { l for writing -- i.e., write .gz header, enable compression # Write header puts -nonewline $f [binary format "H*iH*" "1f8b0800" [clock seconds] "0003"] set CRC [binary format x4] # Init/attach compression zip -attach $f -mode compress -nowrap 1 fconfigure $f -translation binary -encoding binary # Init/attach CRC crc-zlib -attach $f -mode transparent -write-destination ::gz::CRC($f) -write-type variable fconfigure $f -translation binary -encoding binary fconfigure $f -translation binary -encoding binary return $f } # Detach from a channel for writing -- i.e., write .gz footer incl. CRC proc detach_w {f} { variable CRC set SIZE [tell $f] unstack $f ; # CRC unstack $f ; # gzip puts -nonewline $f $CRC($f) puts -nonewline $f [binary format "i" [expr $SIZE % 0x100000000]] } # Attach to a channel for reading -- i.e., read and check .gz header, enable decompression proc attach_r {f} { # (using code from [http://wiki.tcl.tk/6175]) set id [read $f 2] if { ![string equal $id \x1f\x8b] } { error "GZip channel $f: not a gzip file." } set cm [read $f 1] if { ![string equal $cm \x8] } { error "GZip channel $file: unknown compression method." } binary scan [read $f 1] b5 FLAGS puts $FLAGS foreach {FTEXT FHCRC FEXTRA FNAME FCOMMENT} [split $FLAGS ""] {} binary scan [read $f 4] i MTIME set XFL [read $f 1] set OS [read $f 1] if { $FEXTRA } { binary scan [read $f 2] S XLEN set ExtraData [read $f $XLEN] } set name "" if { $FNAME } { set c [read $f 1] while { $c != "\x0" } { append name $c set c [read $f 1] } } set comment "" if { $FCOMMENT } { set c [read $f 1] while { $c != "\x0" } { append comment $c set c [read $f 1] } } set CRC16 "" if { $FHCRC } { set CRC16 [read $f 2] } # Init/attach decompression zip -attach $f -mode compress -nowrap 1 fconfigure $f -translation binary -encoding binary # Init/attach CRC crc-zlib -attach $f -mode transparent -read-destination ::gz::CRC($f) -read-type variable fconfigure $f -translation binary -encoding binary return $f } # Detach from a channel for reading -- i.e., check .gz footer incl. CRC proc detach_r {f} { nel for reading -- i.e., check .gz footer incl. CRC variable CRC set cmpSize [expr [tell $f] % 0x100000000] binary scan $CRC($f) i cmpCRC unstack $f ; # CRC unstack $f ; # gzip binary scan [read $f 4] i gzCRC binary scan [read $f 4] i gzSize binary scan [read $f 4] i gzSize if {$gzCRC != $cmpCRC} { error "GZip channel $f: CRC mismatch." } if {$gzSize != $cmpSize} { error "GZip channel $f: Size mismatch." } } } # Demo program: # Decompress to stdout: gztest.tcl filename.gz # Compress to stdout: gztest.tcl filename if {[llength $argv] != 1} { puts "Usage: gztest.tcl filename" exit -2 } set fn [lindex $argv 0] set f [open $fn r] fconfigure $f -translation binary if {[string match "*.gz" $fn]} { # Is .gz ::gz::attach_r $f fcopy $f stdout ::gz::detach_r $f } else { # Is not .gz ::gz::attach_w stdout fcopy $f stdout ::gz::detach_w stdout } ====== ---- To save people headaches working out how to do this for HTTP streams: ====== set gzip [binary format "H*iH*" "1f8b0800" [clock seconds] "0003"] append gzip [zlib deflate $content] append gzip [binary format i [zlib crc32 $content]] append gzip [binary format i [string length $content]] set content-encoding gzip ====== ([CMcC] modded this snipped 10Jul07 after much head-scratching and some experimentation) [BAS] I was able to send gzipped content by doing: ====== fconfigure $chan -translation binary set gzip [zlib gzip $content -header [list crc [zlib crc32 $content] time [clock seconds] os 3]] set content-encoding gzip ====== `os 3` just means it was generated from Unix OS. <<categories>> Application | Compression | File